ABSTRACT

This conference was organized around four policy concerns: (1) the
needs of handicapped, gifted, and bilingual students; (2) the use of
test results to allocate federal compensatory education funds; (3) the
validity of minimum competency testing; and (4) the demand for
increasingly sophisticated evaluations. Garry L. McDaniels discussed
the Education for All Handicapped Children Act; James J. Gallagher
presented issues on the identification and education of gifted
children; and Maria Medina Swanson described progress made in
bilingual education since the Bilingual Education Act and the Lau v.
Nichols court decision. Joel S. Berke introduced the topic of funding
allocations; Fred E. Burke presented New Jersey's rationale for
combining test scores and socioeconomic status as an index for
compensatory program funding; and George F. Madaus supported the use
of statewide norm-referenced achievement tests. Mark R. Shedd and
R. Robert Rentz described minimum competency testing programs in
Connecticut high schools and in the Georgia state colleges,
respectively. Finally, Peter H. Rossi stressed the need for
recognizing when not to evaluate, the difference between pilot and
full-scale programs, and complications in calculating
cost-effectiveness; and John Ellis illustrated how evaluations have
influenced congressional appropriations. The 1979 Educational Testing
Service Measurement award was presented to John C. Flanagan. (CP)

introduction



Demands by educational policy makers for applications of measurement
to significant new tasks are having far-reaching effects on education
and measurement. Measurement professionals are being asked to help in
designing educational programs for, among others, handicapped
children, gifted children, and bilingual children. These professionals
are also being asked how measurement can help in allocating funds to
schools, determining qualifications for high school diplomas and
college degrees, and evaluating the worth of new educational programs.
These new and complex demands have given rise to congressional debate,
federal and state conferences, and extensive discussion and
developmental work by measurement specialists. Measurement and
educational policy is the theme of this volume, which includes the ten
papers presented at the 1978 Educational Testing Service Invitational
Conference.

Current educational policy is characterized by concern with the needs
of special student groups. The first three chapters by Garry L.
McDaniels, James J. Gallagher, and Maria Medina Swanson on this topic
consider handicapped students, gifted students, and bilingual
students. Each of these groups of students presents a different set of
challenges to existing measurement capabilities. These chapters
indicate that progress is being made in meeting these challenges, and
they identify the critical next steps that need to be taken.

Because funding is of central importance in the operation of schools,
the possibility that test data might constitute a useful component in
formulas for allocating educational funds is currently the subject of
vigorous discussion. The chapters by Joel S. Berke, Fred E. Burke, and
George F. Madaus on this subject point out pitfalls and safeguards in
this use of tests based on both measurement and policy considerations.

The use of tests for evaluating and certifying achievement has a long
and honorable history. What is new is a strong movement toward
developing and using statewide minimum competency tests for high
school students. At the college level, there has been a long-term
trend toward greater structuring of state-supported systems of higher
education. New developments in statewide testing of high school
students and of college students are discussed in separate chapters by
Mark R. Shedd and R. Robert Rentz.

Perhaps the most pervasive relationship between measurement and
educational policy arises from the close association between program
evaluation and measurement. The demand of policymakers for
increasingly sophisticated evaluations of innovations and
interventions is generating new and difficult tasks for measurement.
Two aspects of evaluation are discussed in this volume. The chapter by
Peter H. Rossi examines the strategy of choosing appropriate programs
to evaluate so that evaluation resources may be used most effectively.
The chapter by John Ellis provides an insight into the way in which a
variety of considerations, including program evaluations, interact in
reaching decisions about federal programs.

The program for the 1978 ETS Invitational Conference was planned by:
Scarvia B. Anderson (chairperson), Joan C. Baratz, Jack R. Childress,
James R. Deneen, Winton H. Manning, Samuel J. Messick, Warren W.
Willingham, and Jane D. Wirsig.

The papers presented at the 1978 ETS Invitational Conference provide
impressive evidence that the educational community is looking to
measurement for help in coping with emerging policy questions and that
many able measurement people are responding admirably to these
demands.

William W. Turnbull

The citation for the 1978 ETS Award for
Distinguished Service to Measurement
summarizes Dr. Flanagan's many
contributions as scientist,
scholar, and administrator.

citation: John C. Flanagan



John C. Flanagan has been well-known to several generations of
graduate students for his wide-ranging technical and scientific
achievements, from innovative research techniques to psychometric
derivations to seminal books and articles. Over the years, those same
students, maturing as researchers and professionals, have come to
marvel at his ability to translate scholarly work into pioneering
applications of social science, certainly the hallmark of his
distinguished career.

After serving several years as associate director of the Cooperative
Test Service, Dr. Flanagan organized and directed, from 1941 to 1946,
the Aviation Psychology Program of the Army Air Force; this program
was a giant undertaking developed to apply scientific methods of
psychological measurement in the selection of pilots during World
War II. A demonstrable increase in predictive validity in that
selection along with a decrease in aircraft accidents justify
characterizing this work as one of the dramatic success stories of
applied psychology.

As founder and, for most of the past thirty years, chief executive
officer of American Institutes for Research (AIR), he expanded his
military research experience into a wide range of social applications.
Of the hundreds of projects initiated by AIR under his leadership,
perhaps the most significant was Project TALENT, the first
comprehensive longitudinal study of educational development. That work
led to Project PLAN, the first comprehensive computer-based program
for prescribing, monitoring, and evaluating the learning progress of
individual students throughout an entire school system. Finally, it is
characteristic of John Flanagan's vision and intellectual breadth that
his attention has turned most recently to the quality of life in
American society; his chief professional concerns now are how to
assess the quality of life and how to develop educational and social
strategies for its improvement.

New Directions for Testing and Measurement, 1979

It is no surprise that John Flanagan has received many honors and
citations. In addition, his professional leadership has been
recognized by his colleagues; he has been elected president or
sectional vice president of a number of professional organizations,
including the American Educational Research Association, the American
Association for the Advancement of Science, the National Council on
Measurement in Education, the Psychometric Society, and four different
divisions of the American Psychological Association.

For his many contributions to the theory and practice of educational
research and measurement, and for his productive career as scientist,
scholar, and administrator, ETS has the honor to present the 1978
Award for Distinguished Service to Measurement to John Flanagan.

previous recipients of the ETS Measurement Award

1970 E. F. Lindquist
1971 Lee J. Cronbach
1972 Robert L. Thorndike
1973 Oscar K. Buros
1974 J. P. Guilford
1975 Harold Gulliksen
1976 Ralph Winfred Tyler
1977 Anne Anastasi

Legislation to meet the needs of handicapped children
may bring about massive upgrading of our use
of existing measurement technology.

assessing handicapped students:
beyond identification

garry l. mcdaniels



The Education for All Handicapped Children Act (Public Law 94-142)
was signed into law by President Ford in 1975; it was to be
implemented by September 1, 1978, for children between the ages of
three and eighteen and by September 1, 1980, for children between the
ages of three and twenty-one. This act requires that children be
assessed in order to determine whether or not they are handicapped and
to provide data for developing the individualized educational programs
that they need. The measurement community should anticipate new
demands on both its technology and its human resources as a result of
the act. The purpose of this chapter is to identify two areas of
weakness that may be uncovered by those demands; the weaknesses, in
turn, suggest some new directions for the measurement of children over
the next five to ten years.

establishing measurement guidelines

The creators of Public Law 94-142 assumed that there was a
well-trained professional capacity in the United States in the area of
measurement. They also assumed that this capacity was large enough and
distributed widely enough to reach most of the children and youth
affected by the act. These assumptions must have existed, for in one
of its sections (P.L. 94-142: Sec. 612 (2) (C)) the act required the
states to institute procedures assuring that:

All children residing in the state who are handicapped, regardless of
the severity of their handicap, and who are in need of special
education and related services are identified, located, and evaluated.

These children number in the millions and reside in all areas of the
United States.

This basic assumption is reasonable. For example, one of the great
accomplishments of psychologists during World War I (that was greatly
expanded during World War II) was the creation of the large-scale
testing program. The group test, the paper-and-pencil format, and
machine scoring were technological breakthroughs that provided highly
trained psychologists with numerous assistants and thus relieved them
of direct contact with soldiers except in unusual cases. The huge
screening program of the military could not have been carried out
using individual assessment technology.

The measurement innovations developed in the middle of this century
are commonplace in the United States today. Civilian uses of
measurement devices are extensive in both schools and businesses, and
the civilian work force created to administer those measurement
devices is large. In addition, there is hardly a college or university
in the country that does not offer numerous courses in testing and
measurement. Nonetheless, the creators of the Education for All
Handicapped Children Act were concerned about the abilities of the
professional community. The states were directed to establish:

Procedures to assure that testing and evaluation materials and
procedures utilized for the purposes of evaluation and placement of
handicapped children will be selected and administered in the child's
native language or mode of communication, unless it clearly is not
feasible to do so.

In addition, the lawmakers directed that "no single procedure [should]
be the sole criterion for determining an appropriate educational
program for a child" (P.L. 94-142: Sec. 612 (5) (C)).

The act also requires that the work of psychologists be made more
public. A provision of the legislation requires that the data used for
child assessment and placement be open to inspection by parents or
guardians. A procedural safeguard that must be assumed by the states
(P.L. 94-142: Sec. 615 (B) (1) (A)) provides "an opportunity for the
parents of a handicapped child to examine all relevant records with
respect to the identification, evaluation, and educational placement
of [the] child and the provision of a free appropriate public
education to such [a] child, and to obtain an independent educational
evaluation of the child." This should be a boon for those measurement
experts and skilled counselors who have tried for years to encourage
parents to examine their children's test results and to use such
results as an aid in planning academic and vocational activities for
them. If the parents and guardians wish to challenge the
interpretations of the available information, they are welcome to do
so; the work of the measurement community is open to inspection and
challenge.

No pattern has yet appeared in the problems encountered by the
measurement community in implementing this act. Some anecdotal
evidence, however, identifies two possible areas of weakness: the
selection and administration of measurement instruments and the use of
data in developing individual education plans.

Selecting and Administering Instruments. The lack of competence in the
selection and administration of instruments can be illustrated by
several examples. For instance, a consultant in special education who
has measurement expertise told us, "I conduct workshops with school
psychologists and I ask: 'How many of you use standardized tests?' All
hands go up. Then I ask: 'On what groups were these tests
standardized?' No hands go up." We also have reports that people are
altering standardized test procedures for various disabilities with no
regard for the accompanying need to modify published norms. And there
are some reports that people are using assessment devices that have
low reliabilities.

It is perhaps too early to say that such isolated events constitute a
pattern. There are, however, definite problems. In 1978, a panel was
called together by the Bureau of Education for the Handicapped to
develop criteria for evaluating the quality of local education agency
assessment programs. There was consensus among this group that
adequate principles now exist which, if implemented in practice, would
make substantial progress in eliminating the measurement problems most
frequently cited.

Given the reasonably well-developed technology of assessment and the
extensive institutional training available to those pursuing careers
in measurement, such patterns should not develop. A renewed commitment
is needed from the measurement community to widely publicize the
standards of its profession. And, if problems persist, more training
for more people may also be needed, possibly supplemented by sanctions
for unprofessional performance.






Developing Individual Education Programs. The lawmakers believe in the
heterogeneous nature of children, and they believe that the
measurement community has the capability to document their
idiosyncratic characteristics. This assumption of heterogeneity
underlies the mandate to develop individual education programs.

Although Public Law 94-142 asks that children receiving services be
counted in one of eleven categories (with specific learning
disabilities, visually impaired, deaf, and so on), these
characterizations have no function beyond identifying handicapped
children. In fact, some states have dropped the characterizing
definitions (these include Louisiana, Massachusetts, South Dakota, and
Wyoming). However, three major problems have arisen as a result of
these counting categories. First, some assessment strategies may not
go beyond affirming or rejecting a child's eligibility for inclusion
in a certain category, a way of testing teachers' suspicions that
certain children belong in particular categories. To reduce the
occurrence of such situations, the Regulations* require that:

The child is assessed in all areas related to the suspected
disability, including, where appropriate, health, vision, hearing,
social and emotional status, general intelligence, academic
performance, communicative status, and motor abilities.

The second problem concerns the relationship between identifying
handicapped children and determining their educational needs. Simply
confirming a child's etiological characterization does little to help
educational personnel develop a program for that child. Confirming a
child's loss of hearing, for example, does little to delineate the
idiosyncratic educational needs of that child. Measurement experts
have discussed this paradox at great length in this decade: The
criterion-referenced instrument is increasingly being recommended as a
means for documenting a student's competency in specific skills, but
practice in using this kind of instrument appears to be lagging behind
its advocacy.
Third, and perhaps of more current concern, some standard "treatments"
have become associated with various etiological characterizations of
children. Some attempts to assess a child's needs seem to be dictated
by initially assuming that this person is somehow homogeneous with up
to a million other children in the counting category. Parents of some
autistic children have reported to us, for example, that since their
children have been grouped for counting purposes under "emotional
disturbance," the treatment strategies have been primarily
psychological in character. As a result, the child's needs are
assessed by a psychologist or psychiatrist who anticipates a dynamic
treatment; thus they become identified with the concepts and language
of dynamic psychology, which may not lead to the kind of treatment
they really need.

* Regulations: Education of Handicapped Children. Implementation of
Part B of the Education of the Handicapped Act. Washington, D.C.:
Federal Register, August 23, 1977, Part II [121a.532 (f)].

The strength of the assumption of heterogeneity will undoubtedly be
seen in court cases in the next decade. For instance, a suit is
currently being brought against a local school district by the
Michigan Association for Retarded Citizens. This class action alleges
that the defendant "has failed to provide institutionalized special
education because the education provided has been directed toward
chronological groups rather than toward students' individual needs."
Obviously, the measurement community cannot be driven by an assumption
that views children and youth who are handicapped as homogeneous. The
data collected on a child's condition or needs cannot be restricted by
a priori decisions about the category in which the child might be
placed or about the treatment possibilities that exist.

conclusion

The Education for All Handicapped Children Act represents a
significant challenge and opportunity for the measurement community.
Rather than calling for improvements in measurement technology, the
major issues in implementing the act relate to measurement practice
and personnel training. However, the technical skills of measurement
personnel will be on display as a result of Public Law 94-142; the
measurement strategies they employ will have to respond to the
assumption on which the act is based, that children are heterogeneous.
That kind of response will require thoughtful, competent
professionals. Thus, the new direction in measurement may be a
movement toward the massive upgrading of our capacity to utilize
existing technology.



Garry L. McDaniels is director, Division of
Innovation and Development, Bureau of
Education for the Handicapped,
U.S. Office of Education.



New instruments, increased research funding, and
better ways of taking account of differences in
environments are necessary if we are to identify
and serve the educational needs of the gifted
and talented in our society.

measurement issues in
programs for gifted students

james j. gallagher



The future shape of education for gifted and talented children in the
United States depends on a number of factors: the ability of educators
to conceptualize the special needs of and program adaptations for
these children; the ability to demonstrate and evaluate meaningful
progress in special programs for the gifted; developments in the rest
of the educational system (desegregation, accountability, and so on);
and the attitude of the general public about the desirability and
importance of special education for the gifted. In this chapter, I
shall examine some critical measurement issues that influence the
future course of such education efforts.

What individual communities and American society as a whole decide to
do about providing special educational experiences for gifted children
probably depends more on societal attitudes and values than on
educational innovations. Gallagher (1975) identified four broad forces
that are alive in our society and that have influenced such action in
the past:

1. Egalitarianism. There is a strong belief in the need to give all
citizens equal treatment and equal opportunity and a related
determination that there be no "special privileges for special
people." Such attitudes, narrowly applied, can hinder special
provisions for gifted children, especially since "equal education"
often gets translated into "identical education."

2. Universal Education. The commitment of the United States to full
education for all children through high school has kept many children
of limited ability in school. That situation has created a range of
talents and achievement at junior and senior high school ages that is
difficult to manage within a single classroom. Much of the pressure
for special provisions for the gifted is a recognition of such
extraordinary student diversity and the problems it creates for the
conscientious teacher.



3. Decentralization of Educational Decision Making. When each separate
school district makes its own major educational decisions, the need
for special education for the gifted does not seem as pressing as
other, more immediate needs. There is greater opportunity for taking a
longer-range societal view at the state and federal levels. The
program stimulus for the gifted often comes from those levels of
government.

4. Sense of Societal Confidence. As long as there is overconfidence in
the ability of the United States to conquer any obstacle or solve any
problem as it arises, then the pressure to provide special educational
help for the talented is quite low. When some of that overconfidence
is lost, then there is increased pressure to build programs that would
enhance the education of the most talented students in the society
(and thus enhance our overall ability to meet and overcome crises).

Gifted education has profited, ironically, from World War II, the
Sputnik crisis, and the current problems of energy, population,
pollution, and international conflict. Recognition of the social
forces that influence or determine our educational policies is the
first step toward understanding the otherwise curious reluctance of
the society to do more for the gifted student.

gifted education in America

In the long history of Western man, we have honored many gifted
individuals who provided us with new perceptions of humanity and of
our environment. Plato, Mendel, Copernicus, Freud, Darwin, Curie,
Shakespeare, Bronte, and Piaget have each shown us a different
portrait of ourselves and our world. Those changing portraits, in
turn, have resulted in major transformations both in our society and
in civilization. And although we often do not recognize them, below
this level of genius are layers of other gifted and talented
individuals who have made significant, although less society-shaking,
contributions. The scientific discoveries, the creative writing, the
art, and the music that this second echelon of gifted individuals has
produced have also played a major role in changing the total fabric of
our civilization.



Ignoring the education of these gifted and talented individuals cheats
both them and the larger society of their true potential. Yet we
hesitate when considering special education provisions for gifted and
talented children, and we listen to counsel suggesting that the gifted
will make their contributions without any special educational aid or
help. A strong case can be made for the presence of a love-hate
relationship between giftedness or talent and American society. On the
one hand, we revere the gifted individual who has risen from a humble
background. We are proud to live in a society where talent can triumph
over environment or family status. But on the other hand, since our
origins came from battling an aristocratic elite, we are suspicious of
attempts to subvert our commitment to egalitarianism. We do not want a
new elite to develop; as a result, we waver in our attitudes. We
design our elementary and secondary programs for gifted students in
ways that can be defended by cautious administrators as giving no
special favors and not tipping the scales in favor of the societally
powerful or specially endowed (Gardner, 1961).

Kurt Vonnegut, Jr. (1950, p. 7) has carried one of the common feelings
about the gifted in our society to a logical conclusion in his short
story, Harrison Bergeron, which is set in some future society:

The year was 2081, and everybody was finally equal. They weren't only
equal before God and the law, they were equal in every which way.
Nobody was smarter than anybody. No one was better looking than
anybody else.

The reason for this enforced equality was that people who were
outstanding in various ways were given handicaps to equalize the
society. There was a government agency, headed by the Handicapper
General, whose job it was to enforce such equality. Those citizens who
could dance well had to wear sandbags on their feet; those who were
strikingly good looking had to wear masks so as not to embarrass those
who were not. And what about those with high intellectual ability?

George, while his intelligence was way above normal, had a little
mental handicap radio in his ear. He was required by law to wear it at
all times. Every twenty seconds or so, the transmitter would send out
some noise to keep people like George from taking unfair advantage of
their brains.

The essentially destructive approach to "equality" satirized by
Vonnegut influences our feelings about the gifted until we reach
higher education, when a miraculous transformation takes place.

The United States has created the most complex and extensive higher
education and professional school establishment in the world. We may
not think of the curricular offerings of the Stanford Medical School
or the Harvard Law School as programs for gifted students, but we know
that they are; and no apologies are made for the fact that only the
"best" students are allowed to attend. After all, some of us may need
a good lawyer from time to time, others may need an excellent surgeon,
and others would like some good advice from a competent psychiatrist.

current status 



The history of support for programs for exceptional children in the
U.S. Office of Education gives us some insight into the cultural
problems of the gifted and the talented in our society. The federal
government will provide over $900 million in fiscal year 1979 to
improve the education of school-aged handicapped children. These
dollars are certainly needed; in fact, they do not provide all that
handicapped children need in the way of special education services.
But during that same year, the federal government will provide only
slightly over $3 million for gifted and talented children. In short,
for every dollar spent on a gifted child for special education, $100
is spent on a handicapped child.

Is this the appropriate rate of expenditure for exceptional children
in our society? Probably not. It is representative, however, of the
political realities that attend our present system of crisis decision
making in government. Gifted children suffer because they are a
"cool," or long-range, problem. Budget and legislative decisions are
made not on the basis of what might be of ultimate benefit to society
but on what is the greatest immediate crisis or what represents the
largest political pressure. Gifted children may be our best long-range
investment in education, but they do not create problems of immediate
significance; nor have they had a vocal constituency capable of
extracting attention and dollars from public policy makers.

Mitchell and Erickson (1978, p. 13) report from a national survey on
current policies, resources, and services that the national picture of
educational programs for gifted and talented children in 1976-77 is
slightly better than it was in 1971-72: More gifted and talented
students are being identified and served; more states have statutes
and policy documents concerning their education; more money is being
allocated to educational programs for these special children; more
personnel are being assigned to work in this area; and more training
is available. They concluded, however, that, "Despite the fact [that]
there is 'more of everything' now than there was in 1972, . . . the
United States still falls far short of meeting the educational needs
of this special segment of its population." They also concluded that
federal entrance into the issue of educating the gifted and talented
did have one important effect; though modest in its fiscal efforts,
the government modified and extended the generally accepted definition
of the gifted child.

who are the gifted? 

Each culture tends to define giftedness in its own image; the
definition not only fixes the role of the gifted individual in a
certain culture, but it tells us something about the culture itself as
well. What would be called gifted in a primitive society may be very
different from what we would honor in our advanced technological
society. Some cultures, such as that of ancient Greece, honored the
orator, while Rome valued the engineer and soldier, and so on. What
does the current definition tell us about our own culture? According
to Marland (1978, p. 10):

Gifted and talented children are those identified by professionally
qualified persons who by virtue of outstanding abilities are capable
of high performance. These are children who require differentiated
educational programs and services beyond those normally provided by
the regular program in order to realize their contribution to self and
society.

Children capable of high performance include those with demonstrated
achievement and/or potential ability in any of the following areas:

1. General intellectual ability
2. Specific academic aptitude
3. Creative or productive thinking
4. Leadership ability
5. Visual and performing arts
6. Psychomotor ability

Such a definition is a noble attempt to broaden the idea of giftedness
beyond verbal facility, but it cannot become operational without
adequate measuring instruments and more sophisticated theory.

measurement influences

After six decades of trying to measure individuals' characteristics,
we are now engaged in an attempt to understand and predict those
individuals' future behaviors and performances. This attempt has
worked reasonably well in the areas of achievement and cognitive
development, but it has worked less well in such areas as creativity and
leadership. Predictions of creativity and leadership depend, in large
measure, on the nature of the specific environment in which an individual
is behaving, as well as on the characteristics of that individual. Thus,
individual X may be a potential leader in environments 7, 13, and 22
but not in 5, 8, or 9. This interactive approach to measuring leadership
and creativity lacks the decisive ring of saying that someone is a "born
leader," but it is probably more accurate in the long run (Arnold,
1977; Stogdill, 1974).

Cultural Differences. A special problem is encountered in
identifying gifted minority group children who have grown up under
different cultural circumstances than have those children assessed by
standard IQ measurements. There have been three general approaches to
this problem to date. The first of these can be called a statistical
adjustment. Mercer (1978) has developed a technique known as the System
of Multicultural Pluralistic Assessment (SOMPA). This system makes
statistical adjustments to students' actual IQ scores based on the
presence or absence of optimum assessment conditions. According to
Mercer, optimum conditions are present if all students: (1) have had
similar opportunities for learning the materials and acquiring the skills
covered in the test; (2) have been similarly motivated by the significant
other persons in their lives to learn this material and to acquire these
skills; (3) have had similar experience with taking tests; (4) have no
emotional disturbances or anxieties interfering with test performance;
and (5) have no sensory motor disabilities interfering with prior
learning or with their ability to respond in the test situation. According
to Mercer, the pluralistic model assumes that when these factors are held
constant, the individual who has learned the most probably has the greatest
learning potential. Use of this technique has been successful in
identifying gifted and talented minority group children who otherwise might
not have been located.
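
As a rough illustration of what a statistical adjustment of this kind
can look like, the sketch below re-expresses a score relative to the mean
and standard deviation of a comparison group with a similar background,
then places it back on the familiar scale with mean 100 and standard
deviation 15. This is an assumed simplification for illustration only; it
is not Mercer's published procedure, and the function name and the sample
numbers are hypothetical.

```python
def adjusted_standard_score(score, group_mean, group_sd,
                            scale_mean=100.0, scale_sd=15.0):
    """Re-norm a score against a comparison group (hypothetical helper).

    The score is converted to a z-score within the comparison group and
    then mapped back onto a scale with the given mean and SD.
    """
    if group_sd <= 0:
        raise ValueError("group_sd must be positive")
    z = (score - group_mean) / group_sd
    return scale_mean + scale_sd * z


# Made-up numbers: a score of 100 in a group whose mean is 85 (SD 15)
# lies one standard deviation above that group's mean.
example = adjusted_standard_score(100, group_mean=85, group_sd=15)
```

The point of the sketch is only that the same observed score can signal
different standing depending on the group against which it is read.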

A second major approach to identifying gifted minority children
is to try to assess with measuring instruments the characteristics in
those domains that the cultural subgroup puts particular stress on. In
this way, one can identify the special talents in different ethnic groups.
For instance, Bernal (1974) suggests such a test for young Chicano
children based on Piagetian concepts and including the Cartoon
Conservation Scale developed by DeAvila and Havassy (1975). In another
example, Meeker (1978) reports the work of Evelyn Hahn of the Bureau of
Indian Affairs in identifying gifted Navajo students. By using Structure
of Intellect tests that are heavily weighted to figural rather than to
semantic areas, Hahn was able to find gifted Indian children. The
Navajo children tested had particularly high scores for auditory
memory, but they scored low on classification skills in the figural
dimension. Navajo is a sparse language with a minimum of words for
classification, and it is learned largely through the auditory sense.
Thus, this type of identification provides a basis for understanding
cultural differences as well as for plotting some clear curriculum
objectives for Navajo children.

Torrance (1976) reports two types of special tests designed to
identify giftedness in black populations. These tests are "sounds and
images" and "thinking creatively with action and movement." In the
"sounds and images" test, children are asked to describe images
suggested by a series of sound effects. Test results indicate that black
and white children have equally rich imagery storehouses. However, in the
second kind of test, "thinking creatively with action and movement,"
Torrance found that black children responded to problems with action
and movement while white children tended to respond verbally, telling
rather than acting out what they would do. This test allowed Torrance
and his coworkers to use the specially developed talents of the black
subgroup to help identify its gifted and talented members.

The third major technique that has been used to identify gifted
minority children combines tests, rating scales, and peer and adult
nominations. This approach is presented in a systematic form by
Baldwin (1978). She uses eleven different assessment instruments, ranging
from standard intelligence tests to peer nominations, to develop a
composite score for an individual. The use of multiple measures enables
her to find the gifted and talented students within minority groups
without unduly penalizing students for poor performance on any one
of the instruments.
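
One way to see how a composite can avoid penalizing a single poor
showing is the small sketch below. It assumes each instrument has already
been converted to a common rating scale, drops the lowest rating, and
averages the rest. The drop-lowest rule, the function name, and the sample
ratings are assumptions made for illustration; they are not taken from
Baldwin's procedure.

```python
from statistics import mean


def composite_rating(ratings, drop_lowest=1):
    """Average a student's ratings after dropping the lowest ones.

    `ratings` maps an instrument name to a rating on a shared scale
    (hypothetical: 1 = low, 5 = high).
    """
    if len(ratings) <= drop_lowest:
        raise ValueError("need more ratings than are dropped")
    kept = sorted(ratings.values())[drop_lowest:]
    return mean(kept)


# Made-up ratings: one weak result does not pull the composite down.
student = {"intelligence_test": 2, "peer_nomination": 5, "teacher_rating": 4}
composite = composite_rating(student)
```

Other combining rules (a plain sum, a weighted mean) would serve the same
purpose; the choice here is only the simplest one that shows the idea.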

The identification of gifted and talented students within minority
groups has progressed much more rapidly than has the development
of clear and distinctive curriculum adjustments for them. Although
some suggestions have been made (Gallagher and Kinney, 1974), the
field still lacks definitive statements regarding important and
distinctive curriculum adaptations for these youngsters (Baldwin, 1978).

Creativity. Great interest in creativity was spawned by the
theoretical work of Guilford (1950, 1967) and spurred by the imaginative
application of that work by Getzels and Jackson (1962) and by Torrance
(1965). This movement created a blizzard of new measuring
instruments of dubious validity and reliability. Such simple instruments,
of course, did not measure creativity, which is a complex process that
cannot be viewed apart from the subject and the environment.
However, they did measure some characteristics of intellectual fluency and
flexibility, which may be more matters of cognitive style than separate
intellectual operations. They miss the essence of the complex process of
creativity as noted in the study of the creative person (Barron, 1969).

school adaptations 

The two major objectives of special education for gifted
students have generally been agreed upon (see Gallagher, 1975): (1) They
should master the structure of the knowledge disciplines and
understand the basic principles at the heart of their subject matter. They
should learn systems of knowledge rather than simple facts and
associations. (2) They should learn the heuristic skills of problem solving,
creativity, scientific method, and so on, so that they will become more
autonomous learners and not be constrained by the limits of individual
teachers. A number of adaptations have accompanied efforts to meet
these long-range goals.

Content. During the early 1960s, a brief but exciting marriage
between scholars and educators attempted to produce a systematic
reorganization of knowledge in mathematics, physical science, and the
social sciences (Bruner, 1960; Goodlad, 1964). Programs that were
developed during this period emphasized the basic structure of a
discipline; stressed the importance of having the student behave as a
physicist, a historian, or whatever; and encouraged the introduction of
complex ideas as early as possible in the school program. These are all
educational goals that fit the needs of gifted children very well. This
marriage disintegrated in the late 1960s when the Vietnam war and
desegregation took over as major emphases for schools and scholars.
However, it pointed the way toward a new liaison that can aid the clear
presentation of important ideas to gifted and talented students.

Examples of how such synthesis of important ideas can be
accomplished, as well as verification of the viability of the approach, have
been presented by two television series produced by the BBC: Kenneth
Clark's Civilisation (1970) and Bronowski's Ascent of Man (1973). Each
series tried to take central ideas and major insights and build a set of
illustrative examples, conceptual linkages, and consequences around
them. A few brief quotes from the Bronowski series will illustrate major
ideas that are well within the grasp of the gifted and talented from
preadolescence onward.

War, organized war, is not a human instinct. It is a highly
planned and cooperative theft. And that form of theft began
ten thousand years ago when the harvesters of wheat
accumulated a surplus and the nomads rose out of the desert to rob
them of what they themselves could not provide (p. 88).

The different cultures have used fire for the same purposes: to
keep warm, to drive off predators and clear woodland, and to
make simple transformations of everyday life, to cook, to dry
and harden wood, to heat and split stones. But, of course, the
great transformation that helped us make our civilisation goes
deeper: it is the use of fire to disclose a wholly new class of
materials, the metals (p. 124).

Easter Island is over a thousand miles from the nearest
inhabited island. . . . Distances like that cannot be navigated unless
you have a model of the heavens and of star positions by which
to find your way. People often ask about Easter Island, how did
men come here? They came here by accident: that is not the
question. The question is why could they not get off? And they
could not get off because they did not have a sense of the
movement of the stars by which to find their way (p. 19S).

The horse and the rider have many anatomical features in
common. But it is the human creature who rides the horse, and not
the other way about. There is no wiring inside the brain that
makes us horse riders. Riding a horse is a comparatively recent
invention, less than five thousand years old. And yet it has had
an immense influence, for instance, on our social structure.
Plasticity of human behavior makes that possible. That is what
characterizes us in our social institutions, of course, and above
all, in our books, because they are the permanent products of
the total interest of the human mind (p. 412).

Such ideas can be the basis of an exciting curriculum if scholars and
teachers renew their joint efforts and interests.

Skills. The earlier noted adventures in search of creativity and
the creative process have focused attention on the thinking process and
generated some useful instructional programs and materials (Feldhusen
and Treffinger, 1977; Torrance and Myers, 1970).

Learning Environment. Several innovative administrative devices
have been adapted in education for gifted students, such as special
schools, magnet schools, resource rooms, mentors, and tutorial
programs; they are all designed to create an environment conducive to
achieving the two major objectives of such special education. But
evaluating educational programs for the gifted has been difficult without
appropriate measuring instruments, since standard achievement tests
leave much to be desired in this regard (Renzulli, 1976). Since
multiple-choice achievement tests must be constructed such that they allow
most of the students to respond to each item, there is no room on the test
for the kind of knowledge that only the gifted child might learn or
understand.

Consequently, despite the high scores that gifted students obtain
on standard norm-referenced achievement tests, one still may overlook
their real capabilities. For example, standard achievement tests in
history stress much factual knowledge and some reasoning ability. Such
tests may indicate whether or not a youngster has necessary
information regarding the American Revolution or the U.S. Constitution, but
they are unlikely to demonstrate the gifted child's understanding of
revolution as a generic concept and his or her ability to apply that
knowledge to a wide variety of circumstances. Educators would not be
able to discern from multiple-choice testing of simple concepts in
astronomy or physics or from science achievement tests that a youngster
has a grasp of Einstein's theory of relativity. It would be
counterproductive to place sophisticated items like this on a standard
multiple-choice achievement test; the vast majority of students would miss
them totally, which would create problems in test construction and in
norming.

If there were no interest in doing something special or unique
for gifted students, there would be no need to think about better or
different measuring instruments. But if we were correct in our original
assumption that crisis heightens our appreciation of gifted students
and their needs, then we probably can be confident, looking at our
immediate future, that the need for special programs (and thus for
special instruments) will be recognized.

We must discard standard instruments designed for average
students and develop instruments for special populations and unique
educational objectives; a very special and unique type of
criterion-referenced test is needed: one that is designed to measure maximum
rather than minimum competence. In addition, a new set of
instruments would allow us to integrate knowledge of the individual with the
classification of common environmental settings and conditions. This
would help us to properly identify students for leadership and
creativity programs and to find hidden talent in minority groups (Baldwin,
Gear, and Lucito, 1978).

The chronic absence of research and training money for
evaluating and teaching gifted children has led to a disastrous lack of
interest on the part of universities in this topic. The crass financial
truth is that training programs with few enrollees, such as education for
the gifted, cannot pay for themselves and must have external support if
universities are to become involved. And since most innovative ideas in
education still come either from universities or from the research
community, some degree of incentive must be provided if we are to see
dramatic innovations in programs for gifted children in the near future.
Moreover, there is a lack of any deliberate policy to encourage the
development of new instruments for special purposes. Instrument
development is not considered an appropriate use of limited research
dollars. Unfortunately, research cannot compete politically with service
for scarce funds. Service programs provide direct benefits to their
constituents and create instant political rewards, but research may not
have measurable impact until several generations of politicians have
passed by. One solution that has become more and more popular
among scientists concerned with public support has been to propose
that a fixed percentage of service funds be set aside for research and
development. In this way, research and development activities would
become political beneficiaries of pressures for increased service. As
Gallagher (1975, p. 26) says:

One alternative to current operations would link priority
programs to some sliding scale related to general education
expenditures. For example, educational research and development
could be tied to educational expenditures and receive five
percent of the total, whatever that total is. The more money spent
on educational services, the more money would go to research,
and at a percentage level shown to be effective in fields such as
agriculture and health. This would eliminate the temptation
for budget cutters looking for lost dollars to attack a program
whose nature makes it more defenseless than programs with
strong emotional support, such as programs for services to the
handicapped.
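
The arithmetic of such a set-aside is simple, and a short sketch makes
the linkage concrete: research funding is computed from service spending
rather than appropriated separately, so it rises and falls with the
service total. The five percent figure comes from the passage above; the
function name and the dollar amounts are hypothetical.

```python
def research_set_aside(service_total_dollars, percent=5):
    """Return the research share tied to service spending.

    Whole dollars and an integer percentage are used so the result
    is exact (hypothetical helper for illustration).
    """
    if service_total_dollars < 0 or not 0 <= percent <= 100:
        raise ValueError("total must be non-negative and percent in 0..100")
    return service_total_dollars * percent // 100


# Made-up totals: when service spending doubles, so does the set-aside.
before = research_set_aside(200_000_000)
after = research_set_aside(400_000_000)
```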


Other observers of the federal scene have proposed similar schemes.
For example, Challoner (1974) has suggested the formation of a
biomedical research trust fund that would be tied to the gross revenues of
the health industry or perhaps to a percentage of health insurance
premiums. And Krathwohl (1977) has suggested that a fixed percentage of
the federal education allocation go to educational research and
development. Unless some such system-wide strategy is adopted that will
allow long-range goals of great merit, such as the education of gifted
students or the discovery and development of new ideas in
measurement, to be supported or underwritten, we must continue to live with
the less than optimum level of support that now exists.

We in education seem to support the philosophy that new
measuring instruments appear as if by magic, perhaps through a firm tap
by a fairy godmother. Maybe federal agencies are afraid they will be
attacked for trying to subtly influence the curriculum from a national
standpoint. Whatever the reason, education for the gifted and talented
has suffered substantially from having to put on a suit of measurement
clothes that neither fits its needs nor measures its intellectual breadth.
Obviously, the pitiful sum designated in the federal budget for
educating the gifted and talented is not sufficient for research, training, or
instrument development. We may have to rely on private sources, such
as foundations, for the statesmanship and the foresight needed to
support measurement and program innovations. Science flies on the
wings of its measuring instruments; until we, as a nation, recognize
and act on that fundamental concept, our vision of what is possible for
the gifted student will be limited by our own inadequate instruments.
The prevailing viewpoint of those who support special programs for the
gifted is summed up in a quote from Arnold Toynbee (1968, p. 24):

The Creator has withheld from man the shark's teeth, the bird's
wings, the elephant's trunk, and the hound's or horse's racing
feet. The creative power planted in a minority of mankind has
to do duty for all the marvelous physical assets that are built into
every specimen of man's nonhuman fellow creatures. If society
fails to make the most of this one human asset, or if, worse still,
it perversely sets itself to stifle it, man is throwing away his
birthright of being the lord of creation and is condemning himself to
be, instead, the least effective species on the face of this planet.

Increased understanding of the complexities of
bilingual education is yielding better tests
and more effective use of test results.

testing and
bilingual education

maria medina swanson



Millions of students in the United States come from homes in which a
language other than English is spoken. A growing awareness of their
special problems has led to the enactment of numerous federal and
state laws affecting the education of such students. The impact of these
changing educational policies on instructional programs, as well as on
educational and psychological measurement, is being felt across the
country. Yet the need to properly identify and diagnose the specific
linguistic and educational needs of these non-English-speaking
students in order to provide meaningful educational experiences for them
while they are learning English remains a crucial issue for those
involved in bilingual education.

Throughout the history of the United States, there have always
been students for whom English is a second language. And throughout
that time, except for a period beginning in the late 1890s and ending
in the mid-1960s, many of these students have been able to enroll in
schools that offer instruction both in their native language and in
English (Leibowitz, 1978). "An estimated one million children
attended bilingual programs in public schools during the nineteenth
century, not to mention the continuing tradition which started even
earlier in sectarian schools" (Zirkel, 1978, p. 48). Such programs were
available, for example, in Spanish-English public schools in New
Mexico, French-English schools in Louisiana, and German-English schools
in several midwestern and northeastern states.

Toward the latter part of the nineteenth century, however, due
to a combination of circumstances, none of which had anything to do
with educational needs (increasing immigration, religious and ethnic
prejudice, and nationalism), a wave of laws prohibiting instruction in
any language other than English in public and even private schools
spread from state to state. This attitude was compounded by our
involvement in World War I; during those years we pushed the
xenophobic panic button. It was absolutely verboten to speak German, and
speaking any other language was considered suspiciously
un-American. Some states went so far as to levy a fine against anyone overheard
speaking German in a public place. Other states tried to ban foreign
language instruction altogether. The effects of this hypernationalism
were far reaching: by 1925, thirty-four states had statutes requiring
English to be the only medium of instruction in public schools. Its
impact lasted well into the sixties, and we still can see some vestiges
of it today.

In the sixties, our country finally began to awaken. The Civil
Rights Act of 1964 made us more aware than ever before of racial and
ethnic minority groups and their needs, the many deprivations and
injustices they suffered, and their emerging political strength. The 1960
census revealed a phenomenal growth among the Mexican-American
population in the Southwest, which by then accounted for 12 percent
of the total combined population of Texas, California, Colorado,
Arizona, and New Mexico. In New York and other northeastern states,
the influx of Puerto Ricans and other Hispanic immigrants was also
cause for concern. Federal and state governments began to respond to
this growing constituency: the Equal Employment Opportunity
Commission studied employment patterns, the Civil Rights Commission
examined legal rights, Congress suspended English literacy
requirements for voting, and a number of states looked into educational issues
affecting the different minority groups.

Linguistic minorities began to speak out as well. They were
understandably dissatisfied with the failure of the educational system
to meet the needs of their children. In far too many instances, schools
would automatically place students with limited English proficiency in
classes two or three grades below their age group, hoping to make it
easier for them to catch on to English. The results were usually more
damaging than beneficial. In other cases, such students were placed
with low-ability groups at the elementary level and/or channeled into
vocational programs in junior and senior high schools. In addition,
unwilling to accept the idea that in order to succeed they must give up
their cultural and linguistic traditions, ethnic communities throughout
the U.S. began demanding the kind of instruction that was responsive
to their needs: bilingual education. Their rationale was simple and
straightforward: build upon children's strengths by teaching them in
their own languages while they learn English. And they had the special
incentive of knowing that such programs were indeed feasible: the
successful bilingual program implemented in 1963 at the Coral Way School
for Cuban refugees in Dade County, Florida, had served as a model for
a few innovative schools in the Southwest and had helped popularize
the concept of bilingual education among ethnic communities.

The educational community also became involved in the quest.
For example, the National Education Association sponsored a
conference on the education of Spanish-speaking children and in its
report strongly recommended bilingual instruction. Other groups
reached similar conclusions and recommended involving the federal
government. Thus, the road was paved for the Bilingual Education Act.

Title VII: the bilingual education act 

In 1968, Congress took positive steps to help children who could 
not understand instruction in English. Title VII (the Bilingual Educa- 
tion Act) of the Elementary and Secondary Education Act (cited in 
Schneider, 1976, p. 172) included this declaration of policy: 

In recognition of the special educational needs of the large
numbers of children of limited English-speaking ability in the United
States, Congress hereby declares it to be the policy of the United
States to provide financial assistance to local educational
agencies to develop and carry out new and imaginative elementary
and secondary school programs designed to meet these special
educational needs. For the purposes of this Title, "children of
limited English-speaking ability" means children who come
from environments where the dominant language is other than
English.

At last, the "sink or swim" approach, which had contributed to a high
dropout rate among Hispanics and students from other linguistic
minorities, was recognized as ineffective and the English-only policy
was overruled. This was truly landmark legislation. In order to provide
"new and imaginative" programs, it authorized such activities as:
(1) bilingual education programs; (2) programs designed to teach
students about the history and culture associated with their languages;
(3) efforts to establish closer cooperation between school and home;
(4) early childhood education programs; (5) adult education programs
(for parents of students); (6) programs for dropouts or potential
dropouts in need of bilingual instruction; and (7) programs conducted
by accredited trade, vocational, or technical schools.

Shortcomings. Title VII also authorized planning grants,
research grants, and pilot projects to test the plans as well as the
development and dissemination of the bilingual instructional material. And
funds were made available for preservice and inservice training of a
variety of instructional and ancillary personnel (Schneider, 1976). The
act, however, had a few shortcomings. Most noticeable among them
was the absence of a definition of bilingual education. This was
remedied in the manual published by the Office of Education (U.S.
Office of Education, 1971):

Bilingual Education is the use of two languages, one of which is
English, as mediums of instruction for the same pupil
population in a well-organized program which encompasses part or all
of the curriculum and includes the study of the history and
culture associated with the mother tongue. A complete program
develops and maintains the children's self-esteem and a
legitimate pride in both cultures.

Another shortcoming was the "poverty clause" requiring that
participating students be from families that earned less than $3,000
annually or were on welfare. This limitation was removed in
amendments made in 1972. But what was actually the greatest drawback of
all was the general lack of experience of all personnel involved in
implementing the Bilingual Education Act and the scarcity of outside
experts to provide the necessary technical assistance. In order to
implement the kinds of programs called for in the act's guidelines, personnel
would have to be able to conduct linguistic and educational needs
assessments, population studies, and community surveys; design and
plan programs, including long-range goals and five-year program
objectives; design instructional components with process and product
objectives in first and second languages, content areas, and culture
and heritage (including procedures for evaluation, data collection,
analysis, and reporting); acquire, adapt, and develop instructional
materials for student use as well as training materials for staff
development; design and conduct a staff development program for teachers,
paraprofessionals, and support personnel; conduct a program
evaluation outlining behaviors to be measured, instruments to be used,
methods of data collection, and methods of analysis; and involve
parents and community in school activities and advisory councils and
design adult programs for them. In addition, these personnel would be
responsible for teaching students, grading papers, and supervising the
lunchroom.

It is not surprising, then, that in 1970 when the Office of
Education commissioned the Rand Corporation to conduct a study of
several of its programs, the findings showed that bilingual programs were
the hardest to implement. "Title VII began with the fewest available
resources and the least developed program strategy" of any of the
programs, the study added (Andersson and Boyer, 1978, p. 40). The
implementation problems were attributed to inadequate materials,
unrealistic goals, impossible schedules, and an overburdened staff.
Although the study's authors acknowledged that the relative newness
of bilingual education may have been primarily to blame (the study
was made only one year after the start of the bilingual program), they
also observed that the changes attempted by some projects may have
been too ambitious. Lack of experience may have accounted for a slow
and rather painful beginning; nevertheless, the dedication and
enthusiasm of the professionals committed to the philosophy of bilingual
education resulted in continuous efforts to improve all aspects of the
program.

By 1973, third- and fourth-year bilingual education programs
showed substantial progress in program design and instruction,
selection and development of materials, teacher training, and community
involvement. National projects had been established to provide services
for bilingual
instructional programs. For example, the Materials Acquisition
Project identified and evaluated published materials for bilingual
instruction, and the Dissemination Center for Bilingual Bicultural Education
(DCBBE) published and distributed selected project-developed
materials; in this way, some of the initial demands of individual projects for
development of materials were met. Progress had also been made in
identifying achievement, language dominance, and language
proficiency tests that could be used in bilingual programs. Many of these
tests had been developed specifically for bilingual students. An
annotated bibliography listing seventy-nine project-developed instruments
available from noncommercial sources was published by the DCBBE
(Dissemination Center for Bilingual Bicultural Education, 1975).
Assessing Students' Eligibility. The years between 1968 and 1974 made
up an important learning period for bilingual educators. The method of
identifying students eligible to participate in bilingual programs went
through a series of developmental stages. At first it was not uncommon
to find students being diagnosed as limited English-speaking and thus
needing bilingual education simply on the basis of their surnames. In
other cases, such placement was based on nothing more than teachers'
opinions about students' language dominance as indicated by their
classroom performance in English. Dissatisfaction with these assessment
procedures led to the use of a combined approach consisting of (1) a
questionnaire designed to determine which language students used at
home, with their peers, on the playground, and so on; (2) a language
dominance test (an oral interview) during which students were asked to
answer questions or to tell stories about pictures or objects in both
their native language and English (or they might be asked specific
questions about their homes, families, and schools, or otherwise engaged
in conversation in both languages); (3) input from teachers; and (4)
direct observation of students by the evaluators.

Although the combined approach generally resulted in adequate
determinations of language dominance, educators eventually realized
that language dominance and language proficiency were two different
things and that, although language dominance determined a student's
need for bilingual instruction, it told very little about the degree of
that student's proficiency in either language. For instance, a third
grade student transferring to the school after completing the first and
second grades in Puerto Rico is obviously much more proficient in
Spanish than is a third grade student whose Puerto Rican parents speak
Spanish at home but who has struggled through the first and second
grades using English in the United States. Both students are Spanish
dominant and both have limited English language proficiency; however,
the first has a relatively rich and extensive vocabulary and can read
and write in Spanish, whereas the second, although well-versed in
conversational Spanish centering on family and neighborhood topics, has
had far less linguistic experience than has the first. Thus, teachers
soon learned that a class of thirty-five Spanish-dominant students could
very well mean a class with anywhere from one to thirty-five different
levels of proficiency in Spanish and just as many different levels of
proficiency in English, resulting in Excedrin Headache Number 70 for the
teacher. Curriculum planning, materials selection and adaptation, and
instructional approaches and techniques had to take these individual
differences into account. Qualified teachers had not only to teach
content areas in two languages but also to be masters of
individualization, small-group instruction, materials adaptation, and
diagnostic procedures; and above all else, they had to be warm,
sensitive, perceptive, and flexible.

Standardized Testing. The complexities that diverse levels of language
proficiency brought to the classroom were compounded in the area of
standardized testing. The need to develop instruments in the language of
the students proved to be a very complicated undertaking. Translating
existing English language tests proved unsatisfactory because people of
different ethnic and linguistic backgrounds do not think in the same
way, structure thoughts in the same manner, or learn equivalent words
and concepts in the same order. A word or concept that is common and
therefore considered easy among English-speaking children may not be at
all common or even exist in the same form in another language. One
example of this problem is the English word pet (DeAvila and Havassy,
1978). There is no such word in Spanish. The usual translation is animal
doméstico or mascota, depending on the meaning. Both of these concepts
are considerably more complicated than the English pet.

Obviously, special tests had to be developed for these students.
Psychologists, consultants, evaluators, teachers, project directors,
counselors, anyone who had a good idea, began developing tests during
this period. Even one or two commercial publishers decided to give it a
try. Additional problems soon surfaced: regional differences, both
linguistic and cultural; lack of reading skills in the native language;
and gaps of proficiency in the native language that, in many cases, were
filled in the second language (English). Thus, testing a fourth grader's
achievement in science, math, or social studies, for example, may have
required giving instructions, questions, and answers in both the
student's native language and English. And a psychological evaluation
had to consider the possibility that a child might know some things only
in one language and others only in the other language.

state involvement 

When the Bilingual Education Act (Title VII) was enacted in 1968, twenty
states still prohibited instruction in a language other than English.
However, its passage brought about a surge of activity in state
legislatures across the country. They passed laws to lift restrictions
against the use of other languages, laws to allow bilingual instruction,
and laws that appropriated moneys for bilingual programs. A number of
states adopted laws requiring psychological evaluations in the child's
native language and prohibiting any placement of children in special
education classes until such assessment had been made. In 1972,
Massachusetts became the first state to require bilingual education
programs in all schools with twenty or more students of limited
English-speaking ability. Soon Texas, California, Colorado, New Mexico,
and Illinois followed with similar mandates. By 1976, ten states had
statutes making bilingual education mandatory; sixteen states
specifically permitted it; fourteen states had no statutes but tacitly
allowed it; and ten states still prohibited it in some form or other,
although some of them managed to have Title VII programs in spite of
such regulations (Development Associates, 1977). Among the factors that
accounted for state involvement in bilingual education were the number
of linguistic minority students residing in the state, the degree of
political activity of the ethnic community, exposure to bilingual
education through Title VII programs, and the level of awareness in the
state about the need for and the implementation of bilingual education.

State involvement really intensified the flurry of activities
surrounding bilingual education. One reason for this was that the state
requirements were considerably more specific than were the federal ones.
In the study of state programs cited earlier, it was noted that by 1976
seventeen states defined bilingual education as "transitional," that is,
as a temporary bridge to help students progress into an all-English
curriculum. Thus, it became crucial to develop testing procedures for
placing students in bilingual programs, for measuring students'
conceptual growth while in those programs, and for assessing English
language proficiency to determine when students could move into
monolingual (English) classes. Thirteen states had bilingual
certification requirements for personnel teaching in these programs. As
a result, teacher preparation institutions and state certification
boards were put to task to determine what specific knowledge, skills,
characteristics, and competencies a bilingual teacher needed. Thirteen
states included in their bilingual programs a cultural component
recognizing both the importance of self-concept and self-esteem in
scholastic success and the need for schools to be sensitive to cultural
differences in student behavior as well as in learning styles. Eleven
states required strong parental involvement, stressing the importance of
the home environment as a part of the total educational experience and
the need for the school to understand the sociocultural context in which
students are raised. Thirteen states appropriated funds to implement
programs. This brought about the development of a variety of program
models that were appropriate to the particular needs and characteristics
of the population to be served.

Lau v. Nichols: a landmark decision

In January 1974, the U.S. Supreme Court decision in Lau v. Nichols
brought national attention to the educational needs of students of
limited English-speaking ability. In this case, Chinese public school
students claimed that the San Francisco Unified School District was not
providing them with equal educational opportunity. The court ruled in
favor of the plaintiffs, stating that the district's failure to provide
programs to meet the linguistic needs of the students violated Title VI
of the Civil Rights Act. Adding that equal educational opportunity goes
beyond providing the same buildings, books, or teachers, it maintained
that because these students could not understand the language of the
classroom, they were, in effect, deprived of a minimally adequate
education (Teitelbaum and Hiller, 1977).

Though not expressly endorsing bilingual education, the Lau decision
legitimized and gave impetus to the movement for equal educational
opportunity for students of limited English proficiency. It brought the
needs of those students to the attention of every district receiving
federal aid. It set in motion efforts to provide federal enforcement, as
well as technical assistance, through a network of regional centers. And
it raised the public consciousness of the need for bilingual education,
thus aiding the passage of state mandates. The Lau ruling also raised
many questions that have become very familiar in bilingual education.
How many target students must there be? Must they be concentrated in a
few schools? What does limited English-speaking ability mean? Is a
student from a linguistic minority who can speak and understand English
but who reads below level and underachieves in content areas included in
Lau? What are appropriate remedies? May schools choose whatever program
they feel is adequate? Is bicultural education required? What about
school desegregation?

Lau Remedies. Following the Supreme Court decision, the U.S. Office of
Civil Rights asked all school districts receiving federal funds to
conduct a language survey to identify students of non-English
background; this survey subsequently identified over 300 districts that
were not in compliance with Lau. The immediate issue was, of course, how
to go about getting these districts to comply. A set of guidelines
called Lau Remedies was developed to provide guidance to school
districts in assessing students' language development as well as in
determining adequate educational programs for them. After assessing the
students' home or primary language, the districts were required to
assess each student's degree of linguistic function or ability and place
him or her in one of five categories (see DeAvila and Duncan, 1976):

A. Monolingual speaker of the language other than English
   (speaks the language other than English exclusively)

B. Predominantly speaks the language other than English
   (speaks mostly the language other than English but speaks
   some English)

C. Bilingual (speaks both the language other than English and
   English with equal ease)

D. Predominantly speaks English (speaks mostly English but
   some of the language other than English)

E. Monolingual speaker of English (speaks English exclusively)



Of these categories, only A and E are relatively easy to identify; the
others present a problem. One is "struck by the loose manner in which
these levels are defined. As such, they bear no resemblance to the
'operational definitions' . . . given in terms of concrete operations,
such as scores on tests, numbers of items passed and so on" (DeAvila and
Duncan, 1976, p. 247). For example, category C can really cause
problems. The term bilingual can be defined in many ways: native-like
control of two languages, ability to use two languages alternately,
possession of at least one of the four basic skills (understanding,
speaking, reading, writing) in two languages, and so on. In fact,
according to linguists, there are many kinds of bilingualism. Bilinguals
are often referred to as balanced (or unbalanced?), coordinate or
compound, natural or artificial, bilingual or pseudo-lingual, depending
on either how they acquired the languages or how well they command them.
One could assume that those who fall into category C are a homogeneous
group with native-like proficiency in both languages; in reality,
however, a child limited in both English and his or her native language
could very well fit into this category since he or she would speak both
languages with equal ease (or difficulty). Categories B and D are
extremely vague. Since no official definition was offered for
predominantly speaks, it was left up to the districts to decide.

In some states with mandates, similar but somewhat more 
explicit categories had been developed for identifying students requir- 
ing bilingual instruction. In Illinois, for example, the levels of language 
fluency were defined as follows (Illinois Office of Education, 1976):

1. The student does not speak, understand, or write English,
   but may know a few isolated words or expressions.

2. The student understands simple sentences in English, except
   isolated words or expressions.

3. The student speaks and understands English with hesitancy
   and difficulty. With effort and help, the student can carry
   on a conversation in English, understand at least parts of
   lessons, and follow simple directions.

4. The student speaks and understands English without apparent
   difficulty but displays low achievement, indicating some
   language or cultural interference with learning.

5. The student speaks and understands both English and the
   home language without difficulty and displays normal
   academic achievement for grade level.

6. The student (of non-English background) either predominantly
   or exclusively speaks English.

Whereas the Lau categories emphasize language dominance, these describe
students in terms of English language skills and proficiency as observed
in a school setting, as well as academic achievement. However, this,
too, only serves to determine whether or not a student needs bilingual
instruction. Once that is determined, assessment of the student's
proficiency in the native language is essential in order to prescribe
appropriate instruction via the mother tongue.

Resultant Developments. The urgency of complying with the Lau
requirements has led some districts to develop useful assessment
instruments and procedures. Chicago's Functional Language Survey, for
example, includes fifteen items designed to assess the ability of the
linguistic minority students identified through the state-mandated
census to use the English language. The first five items test the
student's ability to repeat sentences said by the rater at normal
conversational speed (for example: "I often play with my friends by the
fence."). Students are scored on a five-point scale for each item
according to accuracy, completeness, and promptness of response. The
next five items assess the student's comprehension and elicit verbal
responses. (These items might include, for example, "Tell me how to play
your favorite game.") The students are again rated on a five-point
scale, this time on the basis of comprehension, meaningfulness of
response, sentence structure, elaboration, and vocabulary. The last five
items do not require testing but are based on students' past
performances. The rater is asked to indicate how a particular student
would perform five tasks (such as repeating the class homework
assignment to English monolingual peers who were not present when it was
given). The rater's answer is to be based on the student's oral language
performance on the previous test or in school during the past year.
After adding all the raw scores for these fifteen items, each student is
categorized as Level I, II, III, and so on, according to his or her
total score and his or her age.
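
The scoring arithmetic described for the Functional Language Survey can
be sketched in a few lines of Python. This is only an illustration under
stated assumptions, not the instrument's actual scoring key: the text
does not say how the five-point scale is numbered (1 to 5 is assumed
here), whether the five rater-judgment items use the same scale (assumed
yes), or what the score bands and age groupings are. The cutoff table and
the age split at 10 are therefore entirely hypothetical.

```python
from typing import Sequence

ITEMS_PER_PART = 5               # three parts of five items each, per the text
MIN_RATING, MAX_RATING = 1, 5    # ASSUMED numbering of the five-point scale

# HYPOTHETICAL bands: (highest total in band, level label) for two age groups.
# The source gives neither the real cutoffs nor how age enters the decision.
HYPOTHETICAL_BANDS = {
    "under_10": [(30, "Level I"), (50, "Level II"), (75, "Level III")],
    "10_and_over": [(35, "Level I"), (55, "Level II"), (75, "Level III")],
}


def total_raw_score(repetition: Sequence[int],
                    comprehension: Sequence[int],
                    rater_judgment: Sequence[int]) -> int:
    """Add the raw ratings for all fifteen items after validating them."""
    parts = (repetition, comprehension, rater_judgment)
    for part in parts:
        if len(part) != ITEMS_PER_PART:
            raise ValueError("each part must contain exactly five ratings")
        for rating in part:
            if not MIN_RATING <= rating <= MAX_RATING:
                raise ValueError(f"rating {rating} is outside the scale")
    return sum(sum(part) for part in parts)


def assign_level(total: int, age: int) -> str:
    """Map a total score and an age to a level using the hypothetical bands."""
    group = "under_10" if age < 10 else "10_and_over"
    for upper_bound, label in HYPOTHETICAL_BANDS[group]:
        if total <= upper_bound:
            return label
    raise ValueError("total exceeds the maximum possible score")


if __name__ == "__main__":
    score = total_raw_score([3, 4, 3, 2, 3], [2, 3, 3, 2, 2], [3, 3, 2, 3, 2])
    print(score, assign_level(score, age=8))
```

The point of the sketch is that the same raw total can place two
students at different levels once age is taken into account, which is
the behavior the survey's description implies.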

The San Diego Observation Assessment Instrument, which was also
developed to comply with Lau requirements, was recently adopted by the
state of California to satisfy the requirements of the Chacon-Moscone
Bilingual Education Act, AB 1329 (Cornejo and Nadeau, 1978). It is made
up of (1) a home language survey; (2) a language observation assessment;
and (3) a final assessment. The home language survey consists of four
questions (in English and in the home language) addressed to the parents
to determine which language a student learned first as an infant, which
language the student presently uses in the home, which language adults
use in the home, and which language the parents use more frequently with
their children. The language observation assessment consists of an
interview conducted by a trained bilingual in which a student chooses
from a set of "action" pictures and answers a series of open questions
asking him or her to list objects in the picture, tell what is taking
place in the picture, and expand conversation about the picture. (Such
questions might include, "What does this make you think of?") On the
basis of the responses given, the student is scored at Level I (lists),
Level II (tells about), and Level III (expands). In addition, each level
is further scored as C = Comprehension, MP = Minimal Production, FP =
Full Production, and P = Production at Level III. The final assessment
represents the composite estimate of proficiency in the home language
and the degree of English fluency demonstrated during the interview.
Students are then placed in one of the following categories (Cornejo and
Nadeau, 1978):

1. Non-English speaking                    Lau classification A
2. Limited English speaking                Lau classification B
3. Bilingual                               Lau classification C
4. Limited other language                  Lau classification D
5. English only                            Lau classification E
6. Mixes languages in both interviews      Special
7. No response in either language          Special


Only students placed in categories 1, 2, and 6 qualify for bilingual
programs. However, secondary students falling in the bilingual category
but scoring below a district's predominant percentile are reclassified
as limited English speaking (LES). Students classified as "limited other
language" and "bilingual" also qualify for bilingual instruction if
their scholastic achievement is low.
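
The placement rules just described amount to a small decision procedure,
sketched below in Python as an illustration only. The category numbers
and the three rules come from the text; the function names, the
percentile arguments, and the way "low scholastic achievement" is passed
in as a ready-made yes/no flag are assumptions made for the example, since
the source does not say how either is measured.

```python
from typing import Optional

# Category numbers and labels as listed in the text.
CATEGORIES = {
    1: "Non-English speaking",
    2: "Limited English speaking",
    3: "Bilingual",
    4: "Limited other language",
    5: "English only",
    6: "Mixes languages in both interviews",
    7: "No response in either language",
}

ALWAYS_ELIGIBLE = {1, 2, 6}            # qualify outright, per the text
ELIGIBLE_IF_LOW_ACHIEVEMENT = {3, 4}   # qualify only when achievement is low


def reclassify(category: int, is_secondary: bool,
               percentile: Optional[float] = None,
               district_cutoff: Optional[float] = None) -> int:
    """Move a secondary 'bilingual' student below the district cutoff to LES.

    `percentile` and `district_cutoff` are hypothetical inputs; the source
    does not specify which test or which percentile a district uses.
    """
    if category not in CATEGORIES:
        raise ValueError(f"unknown category {category}")
    below_cutoff = (percentile is not None and district_cutoff is not None
                    and percentile < district_cutoff)
    if category == 3 and is_secondary and below_cutoff:
        return 2
    return category


def qualifies(category: int, low_achievement: bool = False) -> bool:
    """Return True when the category (and achievement flag) grant eligibility."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category {category}")
    if category in ALWAYS_ELIGIBLE:
        return True
    return category in ELIGIBLE_IF_LOW_ACHIEVEMENT and low_achievement


if __name__ == "__main__":
    final = reclassify(3, is_secondary=True, percentile=20, district_cutoff=40)
    print(CATEGORIES[final], qualifies(final))
```

Written this way, it is easy to see that the rules settle who is placed
in a program and nothing more, which is the limitation taken up in the
discussion of administrative tests.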

The Chicago and San Diego Language Assessment Instruments, as well as
the New York Language Assessment Battery (which responded to the mandate
of the ASPIRA Consent Decree of 1974 for improved assessment of
effectiveness in English and in Spanish; Tilis, Weichun, and Cumbo,
1978), are designed only for determining whether or not students should
be placed in bilingual programs. These are administrative tests
developed in response to legal mandates. Their purpose "suits
administrative needs rather than pedagogical ones" (Shuy, 1978, p. 326).
They help determine the number of students that belong in a given
program, but they offer "no hint as to what to do about teaching them."
No wonder teachers complain. Needless to say, language assessment for
placement is just the tip of the iceberg. Still needed are language
proficiency measures for determining treatment procedures to be used in
the program. There is also a need to determine what really matters in
terms of language proficiency: the more quantifiable and testable
features (such as pronunciation, vocabulary, and grammar) or those that
are less quantifiable and testable (such as semantic meaning and
functional meaning). Shuy argues that functional use of the language is
more critical for effective participation than is knowledge of the
language forms in themselves. A student's ability to seek clarification
from the teacher for some item is far more important to his learning
than are native-like pronunciation and grammar. More must be learned
about cultural differences and how they affect learning styles, and
hence about needed teaching approaches (Cazden and Leggett, 1976).
Appropriate bilingual program models, as well as instructional and
testing materials for diverse groups and circumstances, need to be
further developed.

amendments

The amendments to the Bilingual Education Act (ESEA Title VII) made in
1974 addressed many of the needs identified during the implementation of
the initial legislation; among other things, they sought a definition of
bilingual education, development of bilingual teacher-training programs
at the university level, and preparation programs for bilingual
paraprofessionals, administrators, counselors, and other support
personnel (Schneider, 1976). Greater stress was placed on capacity
building, or "a strategy to provide local school districts with the
human and material resources needed to operate bilingual programs"
(Molina, 1978, p. 23). Since 1974, hundreds of colleges and universities
across the country have begun preparing bilingual teachers, the number
of graduate programs at the master's and doctoral level has multiplied,
and a network of support service centers (training resource centers,
materials development centers, and dissemination and assessment centers)
has been established to help train classroom personnel, provide them
with needed curriculum materials, and assist them with all aspects of
implementing bilingual education programs. In an effort to help with
coordination and to provide technical assistance, funds were allocated
for departments of education in the states in which Title VII programs
operate. The need for research in bilingual education was also finally
addressed; for the first time, substantial funds were allocated for this
purpose, as well as for the establishment of a National Clearinghouse
for Bilingual Education to collect, analyze, and disseminate information
about bilingual programs.

conclusion

These recent efforts in capacity building are beginning to yield
results. The expertise and professional preparation of bilingual
education personnel have changed greatly from the gut-feeling,
common-sense approaches of the early seventies. The increasing
understanding of the complexities of first and second language
acquisition and their implications for diagnosis, placement, and
treatment of students is beginning to yield better instruments,
instructional approaches, and materials. And the growing body of highly
trained researchers specializing in bilingual education is beginning to
provide meaningful and responsible studies and evaluations and thus to
counteract the effects of incomplete and improperly conducted attempts
in the past. In short, we've come a long way in assessing the
educational needs of students for whom English is a second language. We
have an even longer way to go.

Certain ways of using tests as an element in allocating
educational funds are gaining substantial acceptance.

testing and funding:
a policy context

joel s. berke



The 1970s have been a time of great change in the way America finances
its public schools, particularly at the state level (see Berke and
Moskowitz, 1977). One aspect of this change has been a greater role for
the state in school finance, a development that has been advocated since
the turn of the century, beginning with the work of Ellwood Cubberley.
The stimulus in the seventies came primarily from judicial
interpretations of state constitutions: these judicial decisions
required states to change their finance mechanisms to provide greater
equity, greater equality, greater equality of opportunity, or more
thorough and efficient education, depending on the particular state
clause being interpreted. As a result, changes occurred both in the way
states and local districts raise revenues for education and in the ways
they distribute those revenues. The issue of raising revenue for
education can be dealt with briefly, at least as far as tests are
concerned, because tests have not been employed in raising revenues for
education. However, the fact that revenues for education vary among
local districts in each state in direct relation to the availability of
taxable property remains an important issue. Thus, the central problem
on the revenue side boils down to how to break the link between the
availability of taxable property and the amount of money that a local
community has for its schools.
One approach to solving this problem assumes that all districts that
have chosen the same tax rate will receive equal funding. This outcome
may be accomplished by establishing a guaranteed tax base program, by
equalizing district power, or by employing various other technical
approaches. Most recently, and partly as a response to Proposition 13,
we are finding even more interest in systems that ensure greater
parental choice. Vouchers are under discussion again, and there has been
much debate about tax credits at the federal level.

The other major kind of change in school funding involves distributing
revenues. Two related issues in this regard are being addressed at the
present time in states throughout the country, as well as at the
national level. One issue is how to assure that the resources devoted to
a child's education do not vary according to where he or she happens to
live within a state. How can we break the tie on the spending side
between the district a youngster lives in and how much is spent on his
or her education? State efforts to solve this problem have led to
systems designed to cut down on disparities in spending among districts.
Some states have attempted to bring up low-spending districts to a
higher level by enhancing the state funding guarantee level, or by
increasing the share of funding provided by the state. But there is also
a second issue on the spending side of the ledger: how can we ensure a
better match between a youngster's need for educational resources and
the resources provided to him or her? It is this issue that brings tests
into the picture.

testing and federal resource allocations 

The federal government, I think, is primarily responsible for building
this sort of concern into state funding systems. In the 1960s, the focus
of federal aid shifted toward equality of opportunity in an attempt to
overcome the disadvantages that some youngsters brought with them when
they came to school. Title I of the Elementary and Secondary Education
Act is probably the most prominent effort in this direction. This
program uses the number of children in poverty as the chief determinant
of state, local district, and school attendance area allocations. To
determine which children in a target school (a school eligible to
receive Title I funds) will actually participate in the program, the
criterion shifts to educational need. Tests, as well as other measures,
are used to determine which particular pupils will benefit most from the
school's Title I allocation.

Since the early 1970s, however, Congressman Quie has led an effort to
move the test component of resource allocation (in other words, the
educational need component) upward in the allocation chain; thus, in
addition to its role in program participation decisions, testing would
be used in designating target schools and in redetermining allocations
for the district and the state. Over the last three years, the National
Institute of Education (NIE), following a mandate by Congress, studied
the feasibility of this approach. But it is clear that any determination
of whether such a shift should be made is essentially a value judgment.
That is, the NIE study could deal with technical questions, it could
estimate costs, and it could even run some experiments in permitting
school districts to use tests for fund allocation to schools. It could
not, however, resolve the basic question of whether or not it was
appropriate to shift from providing funds on the basis of the number of
poor children having high educational needs, as at present, to providing
funds to meet the needs of all low-achieving children, regardless of
their income levels. The basic intellectual determinant in the decision
to stay with the poverty criterion for Title I was the recognition that
poverty brings problems of its own that deserve special educational
treatment and that children living in poverty areas who are having
difficulty in school have a harder row to hoe than do middle-class or
higher socioeconomic status students who are having difficulty in
school. The decision to stay with the poverty determinant has been a
value decision, although political determinants have also served to
prevent any major changes in the Title I funding formula.

testing and state resource allocations 

The 1978 Educational Amendments permit school districts, in certain
circumstances, to pick their Title I target schools on the basis of the
proportion of children in poverty. However, although the formula for
allocating aid among states and school districts is still geared to the
number of poor children within those jurisdictions, school districts may
now employ tests as well as poverty as criteria for selecting schools to
receive Title I funds and (without reference to their parents' income
levels) for choosing pupils to receive compensatory services paid for
with Title I funds.

The studies conducted by the NIE showed that using tests for the
allocation of funds to states and school districts would require new and
costly test development efforts. In the thirteen districts that were
prompted to use tests experimentally in identifying Title I target
schools (under a 1974 congressional mandate to NIE), no radical shifts
occurred in the clientele or in the operation of the program. The NIE
studies showed that funds appeared to be allocated to a higher
proportion of the district's pupils when tests were included as
criteria, and, if anything, the concentration of minority pupils
receiving Title I services increased, while income levels rose somewhat.
But the trends were not strong, and probably the major conclusion of
this aspect of the Title I study was that the test approach to
identifying target schools was feasible within districts and caused no
major program transformations (see National Institute of Education,
1977a and 1977b).

At the state level, there has been great interest in the last four
or five years in trying to relate funding to educational needs. One
approach is to have weightings in the general equalization formula for
distributing state aid; that is, pupils with identifiable needs, such as
the handicapped, are given additional weight. Another approach is to
have categorical aid (separate funding for high-need pupils) in
addition to equalization aid. In both cases it is possible to identify
pockets of educational need through test performance, and some states
have adopted such mechanisms. The state of New York, for example, has an
extra weighting of .25 for every pupil scoring roughly in the bottom 23
or 24 percent of the pupils taking the Pupil Evaluation Program (PEP)
tests at designated grade levels. The test scores are used as indicators of
need; they are used to pick out schools and districts with higher than
average educational need in order to allocate additional funds in
proportion to the number of high-need pupils (see Goettel, 1977).
Michigan has had a program for a number of years now in which
compensatory education funds are likewise distributed on the basis of
test scores; this program has been the subject of much recent discussion
(see Murphy and Cohen, 1974). And California has a school improvement
act that uses tests to identify pockets of need.
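
The New York weighting just described can be made concrete with a
short calculation. What follows is only a minimal sketch under stated
assumptions: the function names, the per-pupil aid figure, and the
district numbers are hypothetical and are not taken from the New York
formula itself, which involves many other factors. The only value drawn
from the text is the extra weight of .25 per low-scoring pupil.

```python
def weighted_pupil_count(enrollment: int, low_scoring: int,
                         extra_weight: float = 0.25) -> float:
    """Count every pupil once, plus an extra weight for each low scorer.

    extra_weight defaults to the .25 mentioned in the text; everything
    else about this function is an illustrative assumption.
    """
    if low_scoring < 0 or low_scoring > enrollment:
        raise ValueError("low_scoring must be between 0 and enrollment")
    return enrollment + extra_weight * low_scoring


def state_aid(enrollment: int, low_scoring: int,
              aid_per_weighted_pupil: float,
              extra_weight: float = 0.25) -> float:
    """Aid proportional to the weighted pupil count (hypothetical rule)."""
    count = weighted_pupil_count(enrollment, low_scoring, extra_weight)
    return count * aid_per_weighted_pupil


if __name__ == "__main__":
    # Hypothetical district: 1,000 pupils, 240 of them scoring in the
    # bottom group, and a made-up aid rate of $500 per weighted pupil.
    print(weighted_pupil_count(1000, 240))
    print(state_aid(1000, 240, 500.0))
```

Under these assumptions, each low-scoring pupil raises the district's
count by a quarter of a pupil, so additional aid grows in direct
proportion to the number of high-need pupils, which is the property the
weighting is meant to have.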

Another way in which states are now using tests as part of their
allocation approach is as an accountability measure. For example, the
New Jersey supreme court has interpreted a constitutional phrase, the
provision of a "thorough and efficient education," as requiring New
Jersey to set educational standards and then ensure that local districts
meet those standards by providing appropriate educational treatment
for each youngster. Thus, tests are now being used to determine whether
the state's responsibility to ensure each youngster a thorough and
efficient education related to his or her particular need is being met.
This process has also brought about considerable controversy.

conclusion 

The use of tests to allocate resources has been under
investigation at the federal level since the early 1970s. Tests have been
rejected as criteria for distributing federal aid to states and school
districts, but they are used to select participating pupils for Title I
programs and, as of next year, to choose target schools. In addition,
regarding the reform of state systems of revenue distribution for public
elementary and secondary schools, a number of states have sought to
attain a direct measure of educational need and have turned to testing
as a way either of identifying those areas in need of increased
educational services or of showing whether or not the funding system
achieves a suitable match between educational needs and available
resources.

Test scores and determinations of socioeconomic
status are used together in an effort to provide an
equitable compensatory education program in New Jersey.

testing and funding:
the New Jersey experience

fred e. burke 



In recent years, the emergence of several educational, social, legal,
and economic factors has created a renewed interest in educational
funding systems in which student performance on standardized tests
determines the amount of money allocated to local school districts.
One of those factors is the concern of the public and the educational
community over what they perceive as a decline in student test scores.
Without accurate measures of the cognitive and affective variables that
influence student achievement, it is neither valid nor fair to put great
faith in comparisons between past and current student performances
on standardized tests; nevertheless, some policy makers assume that
tests can identify and define problem areas and that more money will
provide the solution to them once they are spelled out. There is an
inherent danger of oversimplification on both these counts.

A second factor contributing to the renewed interest in funding
systems has been the far-reaching movement of the late 1960s and
1970s to reform school finance. In many states, the courts have
provided the impetus for this movement; they have forced legislatures to
reexamine the fundamental moral and legal obligations of state
governments to provide public school students with a thorough and
equitable education. Today, in contrast, we are experiencing a
backlash against state spending in many parts of the country; as evidenced
by California's Proposition 13, taxpayers are becoming increasingly
unwilling to shoulder growing tax burdens. In fact, tax revolt has
spread so quickly that twenty states had antitax referendums on their
ballots in the last election. Chiefly to blame for this phenomenon is the
public sector's failure to maintain high or even mediocre levels of
accountability for funded programs; policy makers must now search
for ways to provide both an equitable distribution of funds and a
reliable accountability standard.

The policy behind every funding formula is to distribute dollars
in some equitable fashion. All too often, however, we become overly
concerned with the funding mechanism and neglect the actual purpose
for providing the funds. Therefore, we must decide rather early exactly
for whom and for what purpose the money is intended. These are
paramount decisions both educationally and socially. For example, should
funds go to youngsters who are educationally, culturally, or socially
deprived or to children from families with low incomes? These are the
kinds of policy decisions that underlie any method that relates funding
to test results.

dangers of test-based funding



The concept of test-based funding has changed the definition
of equality from emphasizing equal opportunity to stressing equal
outcomes. In a few cases, traditional socioeconomic funding models that
provided money to the economically deprived have been replaced by
test-based funding models; monies are now being distributed to
districts whose students perform poorly on standardized tests rather than
strictly to low-income districts. We must be aware of this radical change
and of the precarious position in which it places educators. We may be
offering the schools tremendous disincentives by providing payment for
poor, rather than good, student performance.

Under a program of test-based funding, the poorer the
performance, the more money a district receives, which is contrary to
fundamental educational goals and certainly tempts people to manipulate
test scores to gain more money. And, even worse, once funded students
manage to reach a predetermined test score, the funding is cut off.
There is a real need, particularly among disadvantaged children, for a
continuous flow of money if we are to prevent the kind of cognitive and
academic regression that followed the withdrawal of funds from
various Head Start and Follow Through programs. Still another drawback
to reliance on test scores is the concern of many educators,
particularly those on the local level, that scores on standardized tests may be
misused to evaluate their teaching performance without appropriate
consideration of other factors which affect the scores.

The greatest advantage and the most appealing feature of
test-based funding is the fact that money is channeled directly to measured
student needs. This is an alluring concept, yet we must be wary of its
innate pitfalls. For example, the Minimum Basic Skills Test currently
employed in New Jersey has a narrow range of content. It focuses, as its
title implies, on measuring students' proficiency in reading and
computation, the skills considered necessary for minimal functioning in our
society today. However, the skills considered basic today may become
obsolete in the near future. We must concern ourselves with a broader
range of programs and curriculums, particularly for helping our
disadvantaged children. Such children may master the basic skills and not
require any additional money under a test-based funding scheme, yet
they may be socially and emotionally below society's standards and
desperately need exposure to the socializing and maturing aspects of a
compensatory education program. Social institutions in this country
are filled with people who possess the basic skills but lack either the
ability or the desire to progress beyond the welfare or unemployment
line or, even worse, prison.

alternative funding approaches

Alternatives to test-based funding, such as various equalization
formulas, are currently used in many states. Equalization formulas are
designed to reduce disparities in per-pupil expenditures between
districts. Equalization is easy to administer, but equity is hard to
achieve. For example, we have almost equalized the local district tax
effort in New Jersey but we still cannot generate the same amount of
financial support for every public school student. We allowed local
decision makers to determine where they wanted these funds to go, and,
for reasons of political expediency, educational expenditures were not
equalized. We can equalize taxing capacity but we cannot equalize the
values that are placed on education or determine the priorities that
people place on their actual or perceived needs. In short, equalization
formulas do not necessarily benefit needy students, regardless of how
they are funded.

The allocation of Title I funds is based on socioeconomic
indicators such as census data, Aid to Families with Dependent Children
(AFDC) counts, parental income, and parental education. Obviously,
socioeconomic variables do not necessarily address the needs of or even
identify all students who might need compensatory education. Also,
socioeconomic data, particularly census data, are often outdated by
the time they are compiled. Nor do these indicators compensate for
rapid population changes or shifts, a fact that has been a major issue
since the advent of declining enrollment and inter- and intradistrict
student transfers. In addition, local surveys intended to measure
socioeconomic status are often difficult to develop, have questionable
reliability, and are sometimes an intrusion on people's privacy. The AFDC
count, which is updated annually, is an acceptable measure, but it is
still a substitute. Unfortunately, socioeconomic indicators provide us
with information about corollary types of associations that all too often
do not accurately distinguish needy children from others. Lacking any
more useful information, we make assumptions that have to do with
parental education and parental income because they are categories
that we can measure. In this way, we often confuse the purpose with
the cause and begin to address problems (especially those of inner
cities, such as poverty) that are beyond the range of education. In
order to be effective, the Title I funding approach must be
comprehensive, well organized, and aimed at providing long-range answers to
the problems of social, cultural, and economic deprivation. In essence,
the short-term, narrow-range funding approach can provide only
short-term, narrow results. Our society's ship will continue to sink if we
continue to plug only one of the many holes in our hull.

learning from experience

New Jersey has a unique compensatory education funding
formula. We use socioeconomic and testing criteria to determine the
levels of compensatory aid that will be allocated to local districts. Our
system relies, in part, on a statewide assessment of academic
achievement. This test is administered annually to identify children whose
scores fall below a state minimum standard of performance in grades
three, six, nine, and eleven; we can then estimate the scores of all
grade levels that were not tested. However, we believe that an approach
based entirely on test results would not serve a significant number of
the children in need of compensatory education. A test can provide a
direct measure of needs, but it fails to address the causative factors
related to low achievement and, therefore, might lead to neglect in
areas that we do not test. We thus allocate funds on the basis of two
indexes: test scores and socioeconomic status as indicated by AFDC
count. In any case, our law makes the decision very clear; it states that
funds must be addressed to those who are educationally, socially,
culturally, and economically deprived.

I believe that the system in New Jersey makes sensible use of
both socioeconomic indicators and test results, which enhances our
ability to provide an equitable funding program. We give a lesser
weight to the test measure than to the socioeconomic measure and thus
both minimize the disincentives for high achievement characteristic of
simpler and purely test-based approaches and discourage score
manipulation. Moreover, we can be relatively up-to-date because AFDC
information is updated annually and we test annually. However, while
this system seems to have worked well so far, we realize that we need to
provide an incentive to students to improve their scores. Some solutions
we have considered are maintaining the same funding level over a
three-year period or providing an initial base level of funds and then
allocating additional money after improvement is demonstrated.
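
The two-index approach described above can be illustrated with a small
sketch. The text says only that the test measure is given a lesser
weight than the socioeconomic measure; the actual New Jersey weights
and formula are not stated here. The weight of 0.4, the function
names, the aid rate, and the district figures below are therefore all
hypothetical, chosen only to show how a blended index behaves.

```python
def composite_need_index(share_below_standard: float,
                         share_afdc: float,
                         test_weight: float = 0.4) -> float:
    """Blend a test-based share and an AFDC-based share into one index.

    Both shares are fractions of enrollment between 0 and 1. The test
    weight must stay below 0.5 so that the socioeconomic measure always
    carries the greater weight (0.4 is a placeholder, not the real value).
    """
    for share in (share_below_standard, share_afdc):
        if not 0.0 <= share <= 1.0:
            raise ValueError("shares must lie between 0 and 1")
    if not 0.0 <= test_weight < 0.5:
        raise ValueError("test_weight must be at least 0 and below 0.5")
    return (test_weight * share_below_standard
            + (1.0 - test_weight) * share_afdc)


def district_aid(enrollment: int, share_below_standard: float,
                 share_afdc: float, aid_per_index_pupil: float,
                 test_weight: float = 0.4) -> float:
    """Aid proportional to enrollment times the blended index (hypothetical)."""
    index = composite_need_index(share_below_standard, share_afdc, test_weight)
    return enrollment * index * aid_per_index_pupil


if __name__ == "__main__":
    # Hypothetical district: half the pupils below the state standard,
    # a quarter counted under AFDC, and a made-up rate of $100.
    print(composite_need_index(0.5, 0.25, test_weight=0.25))
    print(district_aid(1000, 0.5, 0.25, 100.0, test_weight=0.25))
```

Because the test share is multiplied by the smaller weight, a change in
test results moves a district's aid less than an equal change in the
AFDC share would, which is one way to read the claim that the lesser
test weight dampens both the disincentive for high achievement and the
temptation to manipulate scores.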



reluctant recognition



Many New Jersey legislators are not enthusiastic about
test-based funding. One reason for this is its rising cost. When it was begun
in 1976, the New Jersey State Compensatory Education Program cost
$32.8 million; this year (1978), the cost is up to $68 million. This
situation presents us with a dilemma. If the need for compensatory
education lessens, the funds will dry up. If the need becomes greater, there
will be greater reluctance to fund the program. In either case, some
children may be left without the compensatory education they require.

Many legislators today are becoming increasingly concerned
that one third of the entire state budget is to be spent on education.
What is more, they are concerned that, with our formulas, the
decisions have already been made and all they can do is vote for or against
them. It is perhaps understandable that the legislators seeking to make
the fundamental decisions want to recenter the decision-making
process. In addition, legislators are becoming increasingly sensitive to the
attitudes of our older citizens. As the population becomes more
and more one of people who do not have children in public schools, we
are going to find a broadening credibility gap and greater
dissatisfaction with our political processes and the way in which we determine the
amount of money available for education. Funding allocations will no
longer be made by means of what we know as the normal democratic
process as the problem of semi-unlimited public programs competing
for finite resources continues to grow. This is something which is going
to happen throughout the country. In New Jersey, I should add, we are
legally mandated to give account to the legislature on the status and
adequacy of the use of the formula in allocating funds.

Finally, the legislature and state board of education are
contemplating the implementation of high school graduation standards
that will involve testing to determine eligibility not only for receiving
diplomas but for remedial programs as well. One question I think we
have to ask ourselves is to what extent such an additional remedial
program is test-based and thus will require additional funds. Where will
those funds come from and to what extent will existing compensatory
dollars have to be reallocated from the lower to the higher grades?
These questions have implications for every state.

Statewide achievement tests were intended to identify children
in academic difficulty. They provide information that is needed at the
classroom and individual school levels. Unfortunately, however, that
information is all too often put to the wrong use. In New Jersey, for
example, results of statewide achievement tests are one consideration
in evaluating tenured teachers. The test results are only one of several
factors considered in those evaluations, but their use in this way has
created and will create anxiety among teachers about their continuing
employment prospects.

upcoming policy decisions

Are all these uses of state test data compatible? Is it logical, for
example, to use the state test both to evaluate schools and to distribute
compensatory dollars to pay for the discrepancies identified by such
evaluations? In New Jersey, we must perform a district evaluation of
every school each year. Beginning next year, we will have to classify
every school as approved, unapproved, or conditionally approved. A
critical component in this classification is how well students do on tests.
In my opinion, we are asking one less-than-perfect instrument, the
standardized test, to bear too much responsibility in educational
decision making. Would not too many uses of test data exert excessive
influence in the allocation of educational services?

A second set of policy questions concerns whether or not
test-based funding can or even should survive in a period of fiscal
contraction. How much of our educational resources should we distribute
through a test mechanism? What should be the role of the people's
representatives in a democratic system? Should they not be the ones who
determine the proportion of funds that should be distributed rather
than leaving it to a formula? What if the students' scores do not
improve? What happens then? Does this not provide extraordinary
ammunition for the decision makers who are no longer child
advocates? They may say that test scores show that dollars do not make any
difference. If and when that happens, scores will become extremely
dangerous for those of us who try to get the maximum amount of
money into public education. There is no direct correlation between
the amount of money we invest and the scores that emerge as a
consequence of that investment.

Is there a limit to the funds that are needed for a test-based
funding mechanism? The legislators say: "Commissioner, your program
started at $30 million, and in three years you have increased it to $68
million. When is it going to stop? How do you get the children out?
You now have a program that is bigger than Title I." Since our total
available resources are growing at a slower rate than are our test-based
allocations, test-based funding can only siphon funds away from other
programs. We have already seen this beginning to happen.

At some point, we must decide whether consistently low test
results warrant new money for the same old programs or if we should
look to more radical alternatives. How do we decide when we have
reached such a critical point? And what will happen to test-directed
funds when test scores improve, as indeed they must if our current
remedial efforts have any validity? These are the kinds of questions we
are beginning to ask and must try to answer not only in New Jersey but
throughout the United States.

I believe that increased reliance on testing in educational
decision making, particularly when tests are used for allocating funds, may
create difficult policy problems. This approach might inadvertently
lead to a reduction in resources for public education. This approach
tends to over-formalize and over-simplify allocation decisions and,
thus, to leave out key steps in the decision process.

In New Jersey, we have found that combining test data with
measures of poverty gives us a balanced system that maintains the
testing component in its proper role. Nevertheless, we realize that funding
mechanisms must be subjected to continual scrutiny to ensure that
they are achieving their purpose.

reference 

Mathis, W. J. "The Use of Basic Skills and Socioeconomic Data in Determining State
Compensatory Funding Entitlements." Paper presented at the American
Educational Research Association Annual Meeting, Toronto, March 1978.



Fred E. Burke is Commissioner of Education of
the New Jersey Department of Education.






Broad educational consequences of test-based
funding must be taken into account in
designing and evaluating funding plans.



testing and funding:
measurement and policy issues

george f. madaus 



The marriage between funding and test performance was first
proposed by a select committee of the Irish Parliament in 1799; sixty years
later, the match was finally arranged by Robert Lowe. In a time of
severe strain on the exchequer brought about by the Crimean War,
and in a time of increasing enrollment and concern over standards,
Lowe tied the knot by formalizing the following recommendation of
the Newcastle Commission (Coulahan, 1975, p. 75):

A searching examination . . . should be made ... of every 
child in every school . . . with the view to ascertaining whether 
these indispensable elements of knowledge are thoroughly 
acquired and to make the prospects and positions of the 
teachers dependent to a considerable extent on the results of
this examination.

The match became known as payment by results. For better or for
worse, it was predicated on the assumption that there is a positive
incentive in linking teachers' salaries to pupil achievement on written
and oral examinations in reading, writing, and arithmetic. Over the
next three decades, the compatibility of testing and funding was
severely strained and the two separated in England at the turn of
the century. Today, once again, in a time of rising expenditures,
accountability, and increased concern about standards, proposals
such as that by Congressman Quie and programs such as
Michigan's Chapter Three have reunited testing and funding. Now,
however, testing is used to indicate where funds for compensatory
programs or remedial assistance should be allocated.

In order to fully understand the current relationship between
testing and funding, several considerations that are central to
present proposals for test-based funding need to be recognized. These
include: the social implications of shifting the operational
definition of educational disadvantage from an index of poverty to one
of poor test performance (see Feldmesser, 1975; Kellaghan, 1977);
the numerous technical, psychometric, and administrative issues
associated with implementing specific proposals (see Feldmesser,
1975; Haertel and others, 1977; Madaus and Elmore, 1973); and
related assumptions that the funds actually provide additional
services to the disadvantaged and that the schools already know how to
remedy the deficiencies that lead to low test performance (see Airasian,
1978; Airasian, Madaus, and Pedulla, 197b). These considerations,
however, have been treated elsewhere and are beyond the scope of this
chapter. Here I would like to consider test-based funding more broadly.
I will argue that it is one indicator of an inexorable but unconscious
populist movement in many states toward a system of public, or
external, examinations. I will describe the mechanisms by which tests can
give an external agency various degrees of control over schooling. And,
finally, I will evaluate the degree of such control that various proposals
for test-based funding would give to the external agency.

new support for external testing 

Tests developed by an agency outside the school have
commonly been used by governments to certify students' success at one
level of education and then admit them to either the next level or to
civil service or other careers. A system of external tests, while not
unknown in this country (witness the New York Regents Exams and the
College Board tests), is, nevertheless, a rather alien concept that is
more common to British and other European systems. As unattractive
as such a system may be to American educators, however, the public is
moving toward acceptance of testing by an agency outside the school,
according to recent opinion polls. A 1976 Gallup survey found that 65
percent of the public agreed that pupils should pass a state or national
exam in order to graduate from high school. In a more recent (1978)
Gallup survey, 68 percent of the general public felt that pupils should
be promoted only if they pass an exam, and 53 percent felt that such
an exam should be prepared by either the state or the national
government. In my home state of Massachusetts, a recent poll showed that 83
percent of the public favored a basic skills examination as a
requirement for high school graduation (Clark University, 1978).

At a time when testing is receiving fierce criticism from many
academics, civil rights advocates, and professional organizations, these
poll results illustrate an interesting dichotomy of attitudes. While
critics are castigating testing, taxpayers, parents, businessmen,
legislators, and much of the media are demanding more testing to increase
accountability, return to "basics," eliminate the influence of social
position, ensure minimal competencies, and improve standards.

The expanding movement toward certifying minimal
competency for graduation and the increase in proposals for test-based
funding are two conspicuous and explicit indicators of a trend toward
external testing. Florida's minimal competency tests, which would link high
school graduation to state-level tests, are a clear example of the former
(Haney and Madaus, 1978; Madaus and Airasian, 1977). The Quie bill
and legislation in Michigan and more recently in Connecticut are
proposals for test-based funding. These proposals involve external testing
programs because of the need for comparable test data at the state or
district level when funds are allocated for remedial assistance.

While the movement supporting external testing programs has
been pressing forward relentlessly in the states, it appears to be dead at
the federal level. The National Institute of Education (NIE) is on record
as being opposed to both national minimal competency tests (Graham,
1978) and federal test-based funding (National Institute of Education,
1977). The reasons for this are not primarily technical or practical,
although there are many interesting and complex administrative and
methodological problems inherent in test-based funding (see
Harnischfeger and Wiley, 1977a, 1977b; Madaus and Elmore, 1973); the
reasons are fundamentally political. Powerful educational lobbies have
opposed both plans because they correctly perceived that such testing
programs could dramatically shift control of the curriculum to the
federal level. The same argument, of course, holds true at the state level,
but there, proponents of test-based funding and minimal competency
programs have been much more successful.

the power of proficiency exams

When results of external exams are the sole or even a partial
determiner of future educational or life choices, or when they are used
as a means to provide positive incentive in a substantial funding scheme,
they influence what is taught, how it is taught, what pupils study, and
how they study (see Madaus and Airasian, 1977; Madaus, Kellaghan,
and Airasian, 1971; Madaus and Macnamara, 1970; Srinivasan,
1971). The mechanism for such control involves the need to agree on a
set of objectives that transcend district boundaries. This in itself is a
sticky point for many since these objectives, although perhaps
minimal, might in fact constitute a national, or more likely a state-level,
syllabus. However, it is the test that measures this syllabus and that is
used to monitor, certify, or allocate funds that is the linchpin of the
control mechanism. Control over the curriculum, teaching, and
learning is mediated through a process that Europeans call the "tradition of
past exams." In most external exam programs (the College Board
exams and Florida's minimal competency program are notable
exceptions), the tests move directly into the public domain once they are
administered; over a period of time, teachers, pupils, and parents
learn to infer from the tests what is important. In reality, the tradition
of these tests defines the important objectives of the schools. It is this
tradition that gives the testing agency the potential for enormous
control over the curriculum and, consequently, over the teaching and
learning process.

Such control is a double-edged sword. On the positive side,
well-defined and valid performance measures have been powerful forces for
redirecting teaching and effecting curricular change (see Bloom, 1950;
Commission on Mathematics, 1959; Morris, 1969). Given our present
emphasis on a return to basics, or mastering minimal competencies,
this could be an important benefit. On the negative side, however,
most studies have found that curriculum, instruction, and learning
regress to the tradition of the tests; the proportion of instructional and
study time spent on various elements of the curriculum is seldom
higher than the predicted likelihood of their occurrence on the exam
(see Madaus and Macnamara, 1970; Norwood Report, 1943; Spaulding,
1938; Srinivasan, 1971). Further, the Irish Intermediate Board of
Education (1971), during the payment by results era, articulated a now
familiar complaint when it deplored interschool comparisons "that
forced schools into competition with one another, competition
which is naturally injurious to the best interests of secondary
education" (pp. xi, xii). Present proposals and programs for test-based
funding or for certifying minimal competencies using norm- or
criterion-referenced tests certainly permit and encourage interstate or
interdistrict comparisons (Madaus and Elmore, 1973).

options for test-based funding

The amount of money available and the way it is allocated
determine the extent of control exercised by the external agency
through its tests. In the nineteenth century, when test results were a
key element in fixing a teacher's salary, the effects on all parties
were devastatingly negative (Herbert, 1889; Holmes, 1911).
Exacerbating the situation was the fact that the tests used by the school
inspectors changed very little from year to year. Matthew Arnold
(1899, p. 136), then one of the school inspectors, cynically described
the payment by results system as a "game of mechanical
contrivance in which the teachers will and must, more and more, learn
how to beat us." Rather than teaching for the test, teachers were
eventually able to teach the test itself, to cram their pupils with
the answers to perennial questions.

In the recent past, we have seen the emergence of performance
contracting, a close relative of payment by results. In both
cases, money was linked to test gains. In performance contracting,
the contractor receives payment; in payment by results the teacher
receives payment. Like its ancestor, performance contracting often
substitutes cramming for learning. To my knowledge, there has
never been an attempt to link substantial financial incentives to a
new test each year based on a stable but well-defined domain of
minimal objectives. If such a system were attempted, the tradition of
the tests would soon become a powerful force in the schools. One could
predict that, after a few years, the distribution of those passing the
tests would stabilize at a very high percentage. If we are talking about
basic or minimal skills, some may argue that this is exactly the
distribution we want. But there is a tradeoff. Given our testing history,
the multiple choice format might be expected to quickly and uncritically
dominate the external test and thus might, unfortunately, influence
the kind of teaching and learning that takes place.

An alternative to a positive incentive, test gain funding plan is
one that links funding levels for remedial assistance programs to low
test performance. This is still an external test program, but it is one
whose effects on the curriculum and on teaching and learning should
be slight, so long as safeguards are built in to discourage schools from
implicitly or explicitly taking steps to depress scores on which funds are
allocated (see Feldmesser, 1975; Harnischfeger and Wiley, 1977b) and
so long as the continuation of funding is not linked to test score gains.
This describes the current situation in Michigan. However, if
continuation funding is reduced when pupils make test gains, as under
Quie's plan, a strong negative incentive is introduced. To avoid such a
negative incentive, the Michigan Chapter Three legislation originally set
up a two-tiered testing program that tested initially to allocate funds
for low-scoring pupils and then again to link continuation funding to
successful test performance. Districts would receive full allocation the
following year for each low-scoring pupil who achieved 75 percent of
agreed-on objectives and a proportionate amount for partial gains.
This use of test results for continuation funding caused considerable
controversy. It was perceived not as an incentive but as a penalty,
a device that would be used to single out teachers with pupil failures
(Murphy and Cohen, 1974). Consequently, this continuation funding
component of Chapter Three has never been implemented. If it had
been, it is likely that the tests eventually would have influenced
teaching and learning in Michigan schools.

Mosher (1973) suggests an interesting use of a two-tiered system
of test-based funding. He feels that commercial norm-referenced
achievement tests are the most suitable devices for initially allocating
funds for remedial assistance programs. He suggests, however, that a
different type of achievement test is best for evaluating the
effectiveness of these programs or for making decisions about continuation
funding. It can be argued that, for a number of reasons,
norm-referenced achievement tests tend to measure general ability rather
than school-specific achievement. Such achievement tests correlate as
highly with so-called intelligence or verbal ability tests as they do with
one another, and they also correlate highly with home background
(Coleman and others, 1966). Thus, they afford a realistic index of the
difficulty the school will have in teaching low-scoring pupils. However,
because of their psychometric properties and the collusive effect of
home and school on the traits they measure, these general achievement
tests are not particularly sensitive instruments for assessing changes in
the school's effectiveness in reaching specific instructional objectives.
Thus, Mosher (1973) argues that they should not be used to evaluate
the effectiveness of programs. Instead, he suggests that tests geared
specifically to programs' instructional objectives be employed.

The effects of using such objective-referenced tests to evaluate
programs should be benign if the funding level determined by the
norm-referenced achievement/ability tests is not affected by the
outcome of the evaluation. However, if continuation funding is tied to
gains on a test referenced to a set of common statewide objectives, then
the potential impact of that test could be great indeed. Thus, I should
like to suggest a variation on Mosher's plan. Like Mosher, I would first
allocate funds on the basis of a general norm-referenced achievement/
ability test. The state could require districts to modify their programs
on the basis of subsequent evaluation results but could not use those
results to reduce the initial funding level. However, a bonus might be
paid to districts for every economically disadvantaged pupil whose
scores reach some agreed-on standard on a test geared to a set of
competencies for a particular grade. Safeguards would need to be built into
the program to avoid the segregation of these bonus-eligible students,
to keep them from being shortchanged in other aspects of the curriculum,
and to guard against accepting minimums as the norm. Such a
plan might capitalize on one positive impact of external exams - the
structuring and focusing of instruction - for the students the schools
have had the least success in reaching: the economically disadvantaged.
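
The variation proposed above has a simple structure that a short Python
sketch may make concrete: the base allocation is fixed by the
norm-referenced test and is never reduced by later results, while a
bonus is added for each economically disadvantaged pupil who reaches the
agreed-on standard. The function name, the dollar amounts, and the
score scale are hypothetical; the text specifies none of them.

```python
def district_funding(base_allocation, bonus_per_pupil, disadvantaged_scores, standard):
    """Return a district's total funding under the bonus plan.

    base_allocation:      amount set by the initial norm-referenced test;
                          evaluation results can never reduce it.
    bonus_per_pupil:      hypothetical bonus for each qualifying pupil.
    disadvantaged_scores: competency-test scores of the economically
                          disadvantaged pupils in the district.
    standard:             the agreed-on passing score (hypothetical scale).
    """
    reached = sum(1 for score in disadvantaged_scores if score >= standard)
    return base_allocation + bonus_per_pupil * reached


# Two of three pupils reach the standard, so two bonuses are added.
print(district_funding(10000.0, 50.0, [40, 72, 85], standard=70))
```

Because the bonus term can only add to the base, poor results cost a
district nothing it was already granted, which is the point of the
variation.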

are we ready for government testing? 

Whether or not compensatory funds should be allocated on the
basis of test scores comes down to deciding whether or not our society is
willing to accept a federal (or more likely a state-level) external testing
system. The acceptance of such a program would alter the present
system of American testing. Instead of a system in which local districts use
privately developed tests in traditional ways (which I feel have minimal
impact on the schooling process), we would move to a system in which
tests used by the state may have a profound influence on the curriculum,
as well as on instruction and learning. The effect on the balance
of power between the local district and the state would be a direct
function of the rewards or sanctions associated with the use of the
external tests.

Can a system of test-based funding be built that could alter the
present balance? Absolutely! Should we then move in this direction?
That is not primarily a measurement question, although there are
measurement issues involved; it is a question of values, politics, power,
and control. Whatever society decides, we must be aware that a system
of external testing linked to funding involves a delicate balance: it is
not a marriage made in heaven.



High school graduation should mean that the
school has provided a suitable program of
learning activities and that the student
has attained defined levels of performance.



tests and diplomas: 
certifying high school 

education 

mark r. shedd



Tests and diplomas are the subject of considerable discussion these
days, both inside and outside of education circles. Their purpose, in
many minds, is to certify or validate education. Frankly, I think we
could benefit from a careful look at both.

The high school diploma has long been a symbol of student
accomplishment, representing the sum total of personal and academic
achievement. But recently it has taken on new significance. Now it is
expected to certify not only individual attainment but also the success
of the schools in providing quality education. This is a subtle but
important shift in emphasis. As a mechanism for public accountability,
the diploma must have more than personal, individual significance;
it must have universal validity. And that, in turn, requires
measurable standards for earning a diploma.

This shift in expectations has produced tremendous pressure to
move in the direction of standardized tests, as well as raging
controversy over the subject. Supporters of minimum competency testing
point to illiterate graduates, frustrated parents and employers, social
rather than competency-based promotions, and the like as reasons for
using minimal performance tests as standards for high school
graduation. But opponents argue that such tests prove little,
educate not at all, and can be seriously misused. Minimum
expectations for schools, they fear, will soon become maximum
expectations for students.

While my own bias is with those opposed to such testing, I
believe that both sides are correct, at least insofar as their facts
are concerned. There are serious problems afflicting the schools, and
they demand our immediate attention. However, testing is a simplistic
reaction to a complex set of problems that demands a more thoughtful
response.

accountability in education

The fundamental issue here is accountability. That very
popular word represents one of the most basic foundations of our
democratic society: the ability of the people to demand that their
public institutions account for the quality of their work. This is
perhaps particularly true for our educational institutions. Because we
Americans value our schools enormously, we expect a great deal from
them and we spend a great deal on them; we therefore have every right
to demand accountability from the institutions we support.

But what exactly do we expect? Despite arguments that the
schools have tried to take on too much, there are broad areas of
agreement about the purposes of education. In general, we want the schools
to help our children learn to communicate and compute, to become
capable of making a living, and to be good parents and neighbors, as
well as wise consumers and voters. Schools should help children form
and express opinions, make judgments, solve problems, be creative,
and enjoy their own lives and the world around them. To borrow from
other writers and educators, we want our schools to enable children to
find pleasure in the exercise of their minds, to help them realize their
potentialities, to educate themselves throughout their lives.

We demand a great deal. And we are deeply concerned about
the quality of education in America. The latest Gallup poll tells us that
two-thirds of the American public believes the quality of education is
declining. And, while I continue to argue that schools today do a
better job of educating more youngsters than ever before, I recognize that
there are students who are not learning; there are teachers who are not
teaching; and, therefore, students, parents, and taxpayers are being
cheated by the schools.

Clearly, we face a difficult dilemma: we must spend our energy
addressing the public concern about the quality of our education while
at the same time attempting to analyze and resolve the problems
involved. The crux of the matter is that education is hard, if not
impossible, to quantify. That is a galling realization for the state
legislator whose constituents demand that something be done about the
quality of education. It is frustrating for a reporter seeking a neat
definition of a "successful" public school. And it leaves no recourse for
a parent who suspects that his child is not being educated. So we
attempt to "measure" education with competency and proficiency
tests. And so great is the pressure for accountability that thirty-three
states now use some form of standardized testing and every other state
is considering it.

While apparently logical to many people, the testing response
has many flaws. It is a simplistic approach that ignores the drawbacks
of proficiency and competency tests: they are limited instruments
measuring limited numbers of things; they are always biased in some way;
and they can identify problems but not causes or solutions. It also
ignores the potential for misuse of proficiency tests. If used to deny
promotion or graduation - a practice that has never been proven to
benefit students - tests have the effect of blaming students for schools'
failures. Such tests are equally unsuitable for use as the sole judge of a
school's success. I am deeply disturbed by the growing tendency to
compare one school with another only on the basis of standardized test
scores. That "bottom line" approach is a meaningless device of the
business world that is unfair to students, schools, and the public that
believes it to be [illegible]. There are, of course, many valid uses of
tests; these range from diagnosis of individual learning problems to the
evaluation of whole programs over time. Furthermore, tests should play an
important role in the overall accountability process. But tests are
inappropriate as the ultimate measure of education; no test has been
proven to accurately predict success in adult life.

an alternative to testing 

The demand for accountability - from students, parents, and
taxpayers - is legitimate and, practically speaking, too powerful to
ignore. We must devise some valid way of certifying that an acceptable
process of education occurs between the first and twelfth grades. One
reasonable alternative to testing is taking shape in Connecticut. This
alternative is not the perfect solution to education's problems; nor is it
entirely independent of the testing approach. With its establishment
this year of a statewide proficiency exam and with its proposal last year
of a statewide competency-based test for granting high school diplomas,
Connecticut hopped on the testing bandwagon, too. But there is
some important restraint in this testing program: Connecticut is easing
into a comprehensive program of accountability that includes testing
but is not limited to it.

The proficiency and competency tests I have just mentioned are
important components of the accountability program and warrant
some explanation. The proposed competency-based high school
diploma test is the more controversial of the two, because even though
it is an optional program, aimed mostly at out-of-school youth, it does
establish statewide criteria for a high school diploma.

Like most states, Connecticut has had a state testing program
for high school equivalency for years. Limited to those over age twenty,
this test gives adults the opportunity to earn a high school diploma.
However, a year ago a study group, established under Connecticut's
Master Plan for Vocational and Career Education, recommended that
a measure of "competency" or applied skills be added to the test; the
group believed that the high school diploma should reflect such
competency as well as demonstrated academic proficiency. The proposal
was aimed chiefly at young men and women - many of them dropouts -
between the ages of sixteen and twenty who might not otherwise
have the opportunity to earn a diploma. The study group also proposed
lowering the eligibility age for the test, thereby permitting some
students to graduate early.

A source of controversy because of that "early exit" provision,
the proposal awaits further action and funding from the General
Assembly. However, there are a number of features of the proposed
test - details that do not make headlines - which would make it an
important part of the overall accountability process. (This is a very
important option for some young people who may not have the choice of
staying in school for the last year or two. For them, the opportunity to
earn a high school diploma and the earning power that goes with it are
essential.) Our proposal would allow students to "test out" of school
only with parental permission, and then only after intensive counseling.
Students who had dropped out of school were also to be given
counseling and encouraged to return to class to earn a diploma.

At the time that it was asked to act on this proposal, the
Connecticut General Assembly was preoccupied with measures for
proficiency testing. Since then, Connecticut has joined the mainstream in
passing a proficiency testing bill, although with some restraint. For one
thing, the bill is called the Education Evaluation and Remedial
Assistance Act, and it means just that: it is based on evaluation and
remedial assistance, not competency tests and not requirements for
promotion or for graduation. It will not be the ultimate arbiter of
students' success [illegible]. The test is intended to evaluate student
proficiency in basic academic skills and to assign remedial assistance where
needed. Furthermore, it is not just one test. The act requires local
districts to test students at three grade levels between the first and eighth
grades, and it calls for a fourth test to be administered by the state to
all ninth graders. The choice of grades is intended to provide ample
time to correct problems indicated by the tests. The law also requires
each district to plan a comprehensive testing program before
administering it, and it assigns state money for remedial efforts.

As state tests go, this is not a bad model for an accountability
device. It is aimed primarily at schools, not students, and its chief
purpose is to provide aid, not labels, for students. Furthermore, it
steers clear of the testing-for-promotion trap. But it is not a perfect
solution. I would prefer to see the statewide test administered at both
grades four and eight rather than grade nine, thus allowing more time
for remedial help. And I am still deeply concerned about the possible
misuse of test results. After our experiences with SAT scores, I cannot
believe that this will not be a problem. Realtors will find the scores
helpful in identifying "good" school systems; parents and taxpayers will
use them to compare students, schools, districts, and teachers. And,
while I know such use is distorted, unfair, and unhelpful, I also know it
is unavoidable, particularly since state money will be allotted to towns
on the basis of the number of students who fail these tests. In
developing regulations for the law, we will be working to minimize this
problem as much as possible.

The most encouraging aspect of this testing legislation is that it
is regarded in a broad context of accountability. By and large, neither
the legislature nor the public assumes the test to be the sole answer to
the accountability issue. This is primarily because Connecticut is
dealing with a larger issue at the moment: the state supreme court ruling
in Horton v. Meskill that Connecticut's system of financing schools is
unconstitutional. Like California, New Jersey, and other states,
Connecticut is thus faced with the prospect of redesigning not only the
financing of education but the structure of that education as well.

examining "suitable" education

Our statutes not only demand equal opportunity for all students
and a reasonable level of funding; they also require that opportunity
be provided each student for something called a "suitable program
of educational experiences," which, to date, has never been officially
defined. In order to shape an equitable finance system, it became clear
very early that we would have to define a "suitable" education as well.
That process is not yet complete, although a final proposal is now
being prepared for review by the school finance advisory panel, the
state board of education, and the General Assembly. Nonetheless,
after months of discussion, debate, criticism, and advice, a consensus
has emerged that gives shape to a definition of suitable education. So
far, we have successfully resisted the urge underlying the competency
movement to quantify "suitable" in terms of a specified list of
programs or student requirements. There will be no state curriculum or
state graduation requirements, and that represents more than just a
deferral to New England's penchant for local autonomy. It reflects an
understanding of the education process as one that must be flexible,
responsive, and individualized, not merely convenient for adults.

Connecticut's "suitable" education program is shaping up as a
series of guarantees to students of appropriate opportunities for their
education. The guarantees will also assure parents and taxpayers of the
accountability they demand for school performance. Among the
elements of a suitable education program are various state and local goals
and objectives; minimum curricular offerings; minimum funding;
appropriate staffing, equipment, and supplies; adequate systems for
managing, evaluating, and improving school programs; and, finally,
an effective evaluation and reporting system including the various
testing programs mentioned earlier. In addition, the key to the process is a
remedial program to be enforced by the state when a school system,
taken as a whole, fails to provide a program that meets these criteria of
suitability.

So what does this definition of suitable education have to do
with diplomas and high school graduation? I believe that requiring
accountability for the outlined elements of a suitable program is the
best way to certify high school completion. There are four major exit
requirements that I believe must be met for each student; the
responsibility for these falls predominantly on the school:

1. The school must certify that each student has had equal
access to a quality education throughout his or her twelve years of
schooling. Every child must be guaranteed protection from discrimination
that prevents him from receiving the education he requires.

2. The school system must have provided each student with a
broad range of learning opportunities in both basic and applied
skills that will enable him or her to function successfully now and in
future life.

3. The system must have helped its students along the way to
reach their full potential. No school system can force a child to learn;
but every school system is responsible for aiding and encouraging the
child. That means using our vast wealth of knowledge about learning
to identify children's talents, abilities, and interests, to uncover
learning problems, and to solve them. It is here that tests of many varieties
may play an important role.

4. Finally, there must be clear expectations of what students
should accomplish. The schools are responsible for establishing such
expectations; the students are responsible for meeting them. The high
school diploma should continue to be a personal statement of
accomplishment representing participation in certain activities and sufficient
achievement in basic academic skills. Ideally, those expectations
should be established on an individual student basis. At the very least,
they should be decided by the local school system and the local
community. They should not be set at the state level. Students, parents,
and educators should all agree on the value and significance of the
high school diploma and the qualifications for earning it.

conclusion

Many would like to see high school education defined by a single
standard, a neat listing of the accomplishments that all graduating
students will have, efficiently identified by a single test score. However,
I cannot agree. I like to think, as George Bernard Shaw did, that
education is "the child in pursuit of knowledge, and not knowledge in
pursuit of the child" (Peter, 1978, p. 173). I cannot and will not believe
that the ultimate goal of education is achieving a minimum score on a
single test. We want everyone to go beyond minimum level. We want
each child to reach his or her maximum potential. And each child is
different; there is no test that measures that difference effectively.

But there is no need to "cop out" on the accountability issue. I
propose, instead, a dual accountability. First, we must hold the schools
firmly accountable for opportunities for learning. They must guarantee
each student the instruction, evaluation, and special assistance he
or she requires. Only in this way can we effectively certify the success of
schools. And I believe that if we take care of the first part, then the
second part - the certification of students - will take care of itself. This
does not, however, relieve the students of responsibility; they should be
held accountable for their own performances. The students should be
fully aware of what is expected of them, and the diploma should be
their reward for meeting those expectations. Legally and morally, we
owe students the opportunity to learn. But we also owe them what they
owe themselves: an expectation of the excellence of which each of them
is capable.



Programs for determining placement in remedial
instruction, certifying competence in reading and writing
skills, and evaluating instructional programs illustrate
how testing can influence the awarding of college degrees.



testing and the
college degree



r. robert rentz 



It is not my intention in this chapter to answer the question of whether
or not testing should be involved in awarding college degrees. That
question is quite compelling, but, unfortunately, I do not know the
answer. A much less compelling question, but one that is much more
manageable, is: How can testing influence who is awarded the college
degree? Testing has, of course, always been used by those concerned
with awarding college degrees. Probably more tests are administered in
college classrooms during the first months of the fall term than are
administered over several years by the College Board and the American
College Testing program combined, and improvements could certainly
be made in many of those classroom tests. However, my concern
is not with the testing program that originates with individual
professors or even with the faculties in specific departments; rather, I am
concerned with the kind of testing that receives its major impetus from
outside the faculty - suggested, mandated, or legislated by
administrators, governing boards, or state legislatures - and that is used to
assess minimum competency for granting diplomas. Numerous
examples of these externally mandated testing programs may be cited. At
the state level, for instance, are the Georgia program that I will
describe shortly, and the program required by the recent Florida law
that calls for the administration of entrance and exit exams for teacher
education candidates. Other programs are unique to individual higher
education institutions. There is generally substantial faculty
involvement in the development and implementation of such programs, but
the impetus for them comes from outside the faculty.

I think the primary motivation behind the development of these
testing programs is the popular belief that college graduates are simply
not as well educated as they should be. While there is little direct
evidence of the performance levels of today's college graduates, there is
much popular comment. Anecdotes abound about the graduate who
cannot write letters of application, memoranda, or even simple
sentences. There are other stories about sixth grade teachers who cannot
read at the level of their own students. It is difficult for many citizens to
believe that four years of college experience will transform a freshman
class, over a quarter of whose members must take remedial English and
math, into a group of graduates that can function at a level expected
of college graduates. Factors such as the necessity for minimum
competency testing of high school graduates and generally declining
Scholastic Aptitude Test (SAT) scores tend to erode the base of confidence
in the ability of the entering college student. Thus, in the absence of
evidence to the contrary, the notion that the college graduate is
somehow educationally deficient persists.

In 1972, partly as a response to this general uneasiness and
partly to gather information for program improvement, the university
system of Georgia began a testing program designed to assess the
reading and writing skills of college students during their sophomore year.
The Georgia system, composed of thirty-three state-supported junior
colleges, senior colleges, and universities, quickly discovered that some
25 to 30 percent of its students could not achieve the minimal levels of
performance expected of them in the two tested areas. These findings
were partly responsible for the establishment of a formal statewide
remedial program in all institutions, accompanied by extensive
placement testing of incoming freshmen. At the same time, to help
individual departments evaluate their programs, the university system
inaugurated major area examinations to be given to bachelor's degree
candidates at their exit point. Hills (1977, p. 9) calls these entrance and
exit testing activities "a very extensive and elaborately coordinated
program of testing. No other state has anything quite like it."

The Georgia programs offer illustrations of several functions
that testing can perform in awarding the bachelor's degree. In the
remainder of this chapter, I will focus on three of these functions:
placement, certification, and program evaluation.

testing for student placement

Testing has been used for selection in college admissions for
many years. Test scores, along with high school grades and other
information, have assisted college admissions officials in making decisions
about who would be admitted. When the number of applicants far
exceeded the available space, high selectivity was the general practice
followed by most institutions. In recent years, however, the opposite
has been true; generally, there has been more space than applicants,
and the pressure to maintain enrollment levels has necessitated less
reliance on previously used selection criteria. As the ratio of space to
applicants has changed in favor of the applicants, the type of
admissions decision that must be made has also changed: selection decisions
have become placement decisions. Selection means deciding whether
or not to admit particular students, whereas placement involves
determining which level of instruction or type of program is best suited to
individual applicants. A rather comprehensive explication of
placement options has been provided by Willingham (1974), who
emphasizes accommodating individual student differences by matching
students with appropriate educational programs. Placement decisions
typically involve such options as exemption from particular courses,
advanced placement, or the use of remedial programs. This last option
becomes increasingly prominent as more of the less-qualified
applicants are admitted.

Placement in remedial programs in Georgia colleges involves a
two-stage decision that uses test information. All applicants for each
of the thirty-three institutions in the university system are required
to submit SAT scores. Students with a combined verbal and math score of
less than 650 (on a scale of 400 to 1600) are required to be further
tested with a set of tests called the Basic Skills Examination.
Students who score below an institution's cutoff point on any of the
three parts of the Basic Skills Examination (math, English, or
reading) must enter that institution's formal remedial program in
those areas in which they are deficient. Before exiting, the students
in the remedial program must again take and pass the part or parts of
the Basic Skills Examination that they previously failed. Students are
allowed up to one year to complete these requirements. Those who begin
but never complete the remedial programs and still enter the regular
college program will never receive degrees. Thus, in this sense,
passing the Basic Skills Examination becomes a requirement in itself
for obtaining a degree.
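
The two-stage decision described above can be sketched in a few lines of Python. This is only an illustration: the 650 screen and the three test areas come from the text, while the function name, the score scale for the Basic Skills Examination, and the cutoff values are hypothetical, since each institution sets its own cutoffs.

```python
SAT_SCREEN = 650  # combined verbal + math threshold stated in the text
AREAS = ("math", "English", "reading")  # the three parts of the exam


def remedial_areas(sat_combined, basic_skills=None, cutoffs=None):
    """Return the areas in which a student must enter remediation.

    basic_skills and cutoffs map each area to a score; both the scale
    and the cutoff values are assumptions made for this sketch.
    """
    # Stage 1: students at or above the SAT screen are not tested further.
    if sat_combined >= SAT_SCREEN:
        return []
    if basic_skills is None or cutoffs is None:
        raise ValueError("Basic Skills scores and cutoffs are required")
    # Stage 2: compare each part with the institution's own cutoff.
    return [area for area in AREAS if basic_skills[area] < cutoffs[area]]


# Hypothetical institution and student; all numbers are invented.
cutoffs = {"math": 50, "English": 50, "reading": 50}
student = {"math": 44, "English": 61, "reading": 50}
print(remedial_areas(600, student, cutoffs))
```

The exit requirement could be checked with the same function: a student leaves remediation once a retest returns an empty list for the areas previously failed.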

While this use of the Basic Skills Examination involves the
function I have chosen to call certification, it suggests certain
characteristics of tests of this sort that should be mentioned here.
Using the Basic Skills Examination for both placement and certification
at thirty-three institutions throughout the state puts a severe strain
on both test security and, to some extent, the credibility of the
certification process. Such a problem can only be solved by issuing new
test forms at fairly frequent intervals. In fact, the requirement for
multiple, equated test forms issued on a regular basis is necessary for
the successful implementation of most certification examinations,
including those in the minimum competency testing movement. In Georgia,
we have dealt with the multiple forms problem by abandoning traditional
test development procedures, as well as the purchase of off-the-shelf
tests, in favor of an item bank approach and the use of latent trait
methodology. The Basic Skills Exams are developed locally on the basis
of the Rasch model. Rasch model procedures provide simple and efficient
solutions to the problems of equating test forms, and they offer the
benefits of an item sampling approach to the item analysis task (see
Rentz, 1978).
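
As a rough illustration of why the Rasch model simplifies equating, the sketch below shows the model's response function and a common-item (anchor) shift that places a new form's item difficulties on the scale of an old form. It is a generic textbook-style example, not the procedure actually used in Georgia; the function names and the item difficulties are assumptions.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model.

    Ability and difficulty are on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def equating_shift(old_form, new_form):
    """Mean difficulty difference on items common to both forms.

    Adding this constant to every new-form difficulty expresses the
    new form on the old form's scale (a simple anchor-item method).
    """
    common = set(old_form) & set(new_form)
    if not common:
        raise ValueError("the forms share no anchor items")
    return sum(old_form[i] - new_form[i] for i in common) / len(common)


# Hypothetical item difficulties (logits) from two separate calibrations.
old_form = {"item_a": 0.5, "item_b": -0.5}
new_form = {"item_a": 0.0, "item_b": -1.0, "item_c": 2.0}
shift = equating_shift(old_form, new_form)
equated = {item: b + shift for item, b in new_form.items()}
print(shift, equated)
```

Because the model has a single difficulty parameter per item, linking two forms reduces to estimating one additive constant, which is what makes an item bank practical.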

testing for certification 

The clearest example of Georgia's use of tests for certification is
the Regents' Testing Program, which assesses the reading and writing
skills of students during their sophomore year. Passing this test is
required for graduation with either an associate or bachelor's degree.
The policy of the board of regents of the university system, adopted
in 1972, contains the following statements (Board of Regents, 1972,
pp. 554-555):

It is the responsibility of each institution of the University
System of Georgia to assure the other institutions, and the system
as a whole, that students obtaining a degree from that institution
possess the basic competence of academic literacy, that is,
certain minimum skills of reading and writing. . . . Students
enrolled in degree programs will be required to take and pass
the test. . . . Passing the test is a requirement for graduation.

The battery of tests used in the Regents' Testing Program is
called the Language Skills Examination. These tests are given four
times a year to about 30,000 students. Students are permitted to take
the tests as many times as desired, subject to any required remediation
policy of the local institution. (Board policy requires the local
institution to provide a remedial program for those failing the test,
and it permits the institution to require the student's participation.)
The tests in the Language Skills Examination are locally developed. The
reading test is a conventional multiple-choice comprehension and
vocabulary test, but the writing test is an actual written essay.

The content of these tests is determined by representatives of
the faculties; their aim is to define a minimum level of performance
that can reasonably be expected of a graduate regardless of the
institution attended. In such a large and diverse student population,
what proficiencies can be certified and how? Insofar as content is
concerned, there are four options: (1) certification on a
course-by-course basis, the process currently in common use, in which
each professor assesses students' competence by assigning grades;
(2) certification of students' competence in their major areas of
study, an option in widespread use but usually operated by recognized
groups outside the college (examples include state teacher
certification boards and other professional licensing and certification
boards); (3) certification based on a core curriculum, a common body of
content that each student is expected to master and that is very
difficult to define; and (4) certification of basic skills, the
solution illustrated by Georgia's Regents' Testing Program.


testing for program evaluation

The Regents' Testing Program also serves a program evaluation
function. The percentage of students who have passed the Language
Skills Examination in each institution in the Georgia system is reported
regularly. The results vary widely among schools, which, over the
years, has resulted in extensive studies of lower-division programs,
particularly English composition. Program evaluation is not its major
thrust, although the Regents' Testing Program can lead to changes in
programs; the impact its results have had on curricular programs has
created mixed feelings about its overall effectiveness.

The Major Area Examinations, sometimes called senior exit
exams, represent a testing activity in the Georgia system that can be
readily identified with evaluation. These exams are selected,
administered, and reviewed by the local institution. Each department
selects or devises its own exam, but the tests used most frequently are
the advanced tests of the Graduate Record Examinations. Each graduating
senior must take a Major Area Examination; psychology majors take a
psychology exam, biology majors a biology exam, and so on. However,
there are no passing requirements; the results are used by each
academic department as part of a review of its academic program. Since
this particular testing program is relatively new, its usefulness is
yet to be determined.

Considerable testing takes place in the Georgia system, and one
thing is clear: testing influences curriculum. It would require
considerably more space to describe all the various ways in which these
testing programs have influenced the higher education system in the
state, but I will mention a few related to the Regents' Testing
Program. Course content has changed. New courses have been added. More
essay tests are given and more in-class writing is done in other
courses besides English. And faculty are increasingly conscious of and
concerned about their responsibilities to students in teaching basic
skills.

The reactions to these trends have been both positive and
negative. One junior college dean writes, "I believe the Regents' Test
has done more than any other single device to improve the quality of
higher education in this state" (Austin, 1978). Yet the head of an
English department (Corse, 1978) declared:

However, because we now are devoting our best efforts to getting
the largest number of students past the essay exam as possible,
we are teaching to the exam, with an entire course, English
111, given over to developing one type of essay writing, the
writing of a five-paragraph argumentative essay written under a
time limit on a topic about which the author may or may not
have knowledge, ideas, or personal opinions. Teaching this one
useful writing skill has the beneficial effect of bringing large
numbers of weaker students to a minimal level of literacy; but,
at the same time, it devastates the content of the composition
program that should be offering the better student challenges
to produce writing of high quality. Because the Regents' Test is
primarily designed to establish a minimal level of literacy, our
teaching to this test, which its importance forces us to do, tends
to make the minimum acceptable competency the goal of our
instruction, a circumstance that guarantees mediocrity.

conclusion

In this chapter, I have approached the issue of how testing can
be used as a determinant in awarding college degrees by describing
several testing programs in the university system of Georgia. These
programs illustrate three functions testing can perform: placement,
certification, and evaluation. In some ways, these functions influence
the individual directly; in others, students are influenced by program
changes brought about by the testing. As we have seen, testing can be a
powerful agent for change. If we are now facing an era of more
widespread use of tests for determining eligibility for college
degrees, then colleges must be aware of the impact such testing is
likely to have on their campuses.

Decisions on what kinds of interventions to
evaluate and what kinds of data to collect are
needed for wise use of evaluation resources.



critical decisions in
evaluation studies

The human services field presently seems to have climbed onto a
plateau, and it appears able to gain additional altitude in small
increments only by expending considerable additional amounts of effort
and resources. Public and health care efforts have lowered the death
rate close to its asymptotic minimum and decreased its variance due to
social class, ethnicity, race, and place of residence. Further
reductions in mortality are going to be difficult to achieve, will
require very heavy expenditures, and may increase undesirable side
effects. Similarly, we have gone about as far as we can go with our
criminal justice system in working to keep crime under control. Further
progress may take more effort than we can afford. In education,
compulsory school attendance until ages sixteen to eighteen plus the
availability of state-supported colleges and universities have brought
our country a long way on the road to universal literacy. But ironing
out the variations in educational attainment and intellectual
functioning that currently exist is clearly a difficult task. In short,
the easy problems in all these fields have been met fairly well; the
difficult ones still lie ahead. Indeed, the more problems we solve, the
more difficult are the dilemmas that still remain.

Corollary to this generalization is the fact that in field after
field new interventions we might devise are not going to have
spectacular effects. We are most likely to consider changes that amount
to little more than tinkering with existing systems. While their
inventors and advocates may tout such changes as fundamental
alterations in our human services, they most likely will turn out to be
small variations on existing themes. For example, the mass-produced
textbook was a fundamentally important innovation, one that lies at the
base of our educational system. And innovative materials for programmed
instruction can be viewed as simply new types of textbooks, perhaps
better than many but not fundamentally different from most and
certainly not as different from regular textbooks as those textbooks
were from whatever preceded them.

Thus, the changes we try out on our human services are minor,
while the problems we must deal with become increasingly intractable.
As a result, any innovation is likely to produce effects that are weak
and inconclusive at best. It is this likely outcome that underlies the
trend toward increasingly rigorous evaluations of interventions. It is
no longer plausible to evaluate educational changes through direct
inspection, through the judgments of experts, or through the reports of
persons experiencing the changes. We have learned that detecting the
effects of interventions requires considerable precision in measurement
and powerful research designs. Thus, as the problems become more
difficult and the interventions become weaker in their effects, the
means for detecting them must become finer; acquiring definitive,
valid information about interventions requires considerable effort,
resources, and expertise, often at levels that appear inappropriately
expensive in relation to the role such information might play in policy
decisions about the interventions in question.

when not to evaluate 

Given an intervention (some new procedure, device, organizational
rearrangement, or whatever) that appears promising, what kinds of
measurement information might a decision maker need in order to
determine whether or not that intervention is worth installing in an
educational institution or system? Clearly the amount of information
that would be desirable is dependent on two characteristics of the
intervention in question. The first of these is the cost involved:
expensive interventions, taking into account not only capital and
operating costs but nonmonetary costs as well, would call for more and
better information than would relatively inexpensive interventions. For
example, it makes absolutely no sense to attempt to measure the
impact of using plastic rather than steel paper clips, a judgment that
should appear obvious to all. The second characteristic is an
intervention's underlying potential for harm: an intervention that may
cause some harm should be evaluated more carefully than should one that
appears to be completely benign. Of course, harm should be viewed
quite broadly; it should include possible damage to organizations as
well as damage to pupils and other persons in the educational
systems involved.

When stated obversely, these two principles have implications
that are not ordinarily taken into account. The obverse says that if an
intervention is inexpensive and obviously benign, then it is not worth
evaluating as to whether or not it has any impact on some particular
educational outcome. The cost of obtaining such information may often
more than offset its worth. Furthermore, this is just the sort of
intervention that is likely to have little impact, and precise
estimates of such impact are extremely costly to obtain. For example,
providing enough money to persons released from state prisons to enable
them to survive for a month or so would not be an expensive
intervention, and it would clearly be helpful to prisoners who are
usually released with sums between $25 and $50. In addition, such
provision might help reduce recidivism by easing the transition to
civilian employment. Detecting such effects is likely to be very
expensive, although providing the additional money is relatively cheap.
As another example, an educational intervention that would make
available to high school math students inexpensive hand calculators is
probably not worth evaluating with any great precision. Similarly, a
federal program that would provide $5 annually to school systems for
each child from a poverty level household is not worth evaluating as
far as impact on the students is concerned. The additional funds could
not possibly hurt either the school systems or the pupils, and the cost
of properly evaluating whether or not such funds had a positive impact
on pupil learning would be extremely expensive. Implementing such a
program might be a waste of money, but evaluating it surely would be
even more of a waste. In short, for inexpensive, clearly benign
interventions, some basic errors are acceptable.

An additional kind of intervention also should not be evaluated.
Indeed, a very good case can be made for the belief that such
interventions should not even be attempted. I refer to "black box"
interventions, the kind for which no specified rationale, theory, or
model postulates how the intervention is going to accomplish its aims.
Perhaps the most frequently employed black box interventions are
those that involve giving unearmarked funds to school systems or
schools in the vague hope that they will somehow improve themselves.
But many other interventions are proposed as well, such as Head Start
programs and educational vouchers, whose specific mechanisms for
producing effects are left unstated.

Black box interventions often arise out of what may be called
failure of nerve. Often, despairing of finding any specific
intervention for which some sound rationale can be proposed, policy
makers devise an intervention that primarily consists of providing
incentives for innovation. Thus, a school system might turn over
direction of local schools to neighborhood school boards in the
optimistic, democratic belief that local parents may be better able to
specify a school curriculum than are systemwide school boards. There
may be good reasons for decentralizing control of schools, but the idea
that such moves will somehow improve the schools because parents are
better at making educational decisions than are educators is not one of
them. A similar example of failure of nerve is subcontracting to
private firms to provide instruction under a profit-incentive system.
Here the rationale is simple, if unsatisfactory: in the absence of any
notion of what to do to improve the schools, simply provide profit
incentives for improvement, and improvements will appear. The contract
learning experiments sponsored by the Office of Economic Opportunity
(Gramlich and Koshel, 1975) demonstrated how futile such attempts were
in producing startling or effective innovations, at least in the short
run.

The main reason that black box, failure-of-nerve interventions
are not worth evaluating is that we learn so little from doing so.
Evaluation of a well thought through intervention, with specific goals
and clear means for reaching them, provides decision makers valuable
information about what to do next if the intervention fails or is only
marginally successful. In the case of a black box intervention,
however, since the mechanism of success is unknown, we are unable to
sort out the essential from the inessential aspects of the intervention
and hence will likely be unable to reproduce or enhance desirable
effects in other settings. Evaluations of such interventions
characteristically provide go or no-go information and do not add to
the cumulative knowledge base. There simply is no substitute for
general understanding and theory in the design of interventions.
Mindless innovation may produce some movement, but it will result in
little progress.

questions an evaluator should answer

The considerations raised so far, in sum, add up to a positive
definition of an evaluable intervention, that is, one that is worth the
funds that must be expended in order to determine with some precision
whether or not it is having its desired effects. Evaluable
interventions can be defined as those that have clearly defined (and
measurable) goals, are theory based, would require a heavy investment
of funds to operate as a program, and may potentially inflict some
harmful effects on persons or organizations. The more of these
qualifications a proposed intervention has, the more carefully it
should be evaluated. And, conversely, the fewer such qualities are
present, the less worthwhile it is as an intervention and the less
worthwhile evaluating its effectiveness may be.

If a proposed intervention meets these qualifications, how can
its subsequent evaluation best provide useful information to decision
makers? The solution requires answers to three interrelated evaluation
questions: (1) Is the intervention effective to a significant degree in
achieving its goals without substantial negative effects? (2) Can the
intervention be delivered (implemented) successfully within the
organizational context in which it would be embedded? (3) Can the
intervention produce benefits that justify the costs, both monetary
and nonmonetary, that are necessary to achieve its intended effects?

Is an Intervention Effective? Since evaluation research came
into fashion ten years ago, we have accrued sufficient experience with
applying powerful research designs under field conditions to learn how
to discern the effectiveness of interventions; typically, the main
obstacles to applying such research to the evaluation of interventions
are time and money. We have learned that it is both possible and
feasible to carry out quite elaborate randomized controlled experiments
under field conditions. We have also learned through experience that,
with proper safeguards, nonexperimental statistical methods can also be
used with considerable confidence in assessing the impacts of
interventions.

In short, it is possible, provided that we are willing to spend the
time and can afford the costs, to obtain quite precise, unbiased
estimates of the effects of interventions. But there is a difference
between statistical proof and its implications for policy decisions.
The best the researcher can do is provide data showing that an
intervention does or does not produce statistically significant
effects. Often, however, when seen in the light of policy needs,
statistically significant effects are not important. For example, the
Educational Testing Service's (ETS) evaluation of Sesame Street (Ball
and Bogatz, 1970) showed that children who had viewed that program had
progressed farther toward an understanding of certain basic
relationships, could recognize more letters of the alphabet, and had a
clearer understanding of some rudimentary arithmetic operations than
had children who had not seen the program; however, there still
remained the question of whether or not such results were significant
from a policy viewpoint. The ETS evaluation showed that after the end
of a year's viewing, on the average, viewers could recognize two more
letters of the alphabet than could those who did not look at the
program. But do two additional letters represent an increase in
learning significant enough to justify the effort that went into the
design of the program? A somewhat similar case involves the Head Start
program. Whether or not it was effective is apparently a controversial
issue among evaluators; however, even if we accept the most optimistic
of the several findings, can we say whether or not its positive effects
are significant enough to merit policy attention?

Furthermore, deciding policy significance involves making
judgments as to whether or not unintended side effects cancel out
positive primary effects. In his reanalysis of the Sesame Street
results, for example, Cook (1975) found that the program had stronger
effects among middle-class children than among poor children; the end
result was a general widening in the learning gap between the two
socioeconomic levels of viewers of the program. Similarly, in the
Seattle and Denver income maintenance experiments, a slight work
disincentive effect was shown to result from income maintenance
payments, especially among secondary workers in households: mothers of
young children, and adolescents. This effect was statistically
significant, but its policy implications were not clear; though at
first glance appearing to be a major drawback, in some respects
(mothers of young children withdrawing from low-paying jobs to keep
house and rear their children, and adolescents remaining in high school
until graduation) it could be judged as a positive outcome. In
addition, it was found that the payments fostered the breakup of
marriages; this, too, may at first seem negative, but the payments may
have provided sufficient income security to free women from unhappy
marriages (Hannan, Tuma, and Groeneveld, 1978).

In short, while the researcher can now feel confident that his
measurements can provide precise and unbiased estimates of the
effects of interventions, this information may not be relevant as far
as policy is concerned. Policy significance is not equivalent to
statistical significance. Judgments still must be made about the
appropriateness of the magnitude of the effects and whether or not
there is a satisfactory tradeoff between positive results and negative
side effects.
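
The gap between statistical and policy significance can be made concrete with a small calculation. The sketch below is illustrative only: it uses a textbook two-sample z statistic with equal group sizes and a common standard deviation, and every number in it (a two-letter difference, a standard deviation of 10, the sample sizes) is hypothetical rather than taken from the studies cited above.

```python
import math


def z_statistic(mean_difference, std_dev, n_per_group):
    """z statistic for a difference between two group means.

    Assumes equal group sizes and a common, known standard deviation.
    """
    standard_error = std_dev * math.sqrt(2.0 / n_per_group)
    return mean_difference / standard_error


# The same hypothetical two-letter gain at two sample sizes.
for n in (50, 5000):
    print(n, round(z_statistic(2.0, 10.0, n), 2))
```

The size of the effect is identical in both cases; only the sample size changes whether it clears the conventional 1.96 threshold. Whether a two-letter gain matters is a separate judgment that no test statistic supplies.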

Can the Intervention Be Delivered Successfully? The most useful
estimates of intervention effects result from randomized controlled
experiments; these must be run by researchers who carefully implement
interventions under conditions that ensure their delivery to
appropriate target groups. Effects, then, are best measured when an
intervention is delivered in a standard way at its intended full
strength. For this reason, randomized experiments on a grand scale have
been primarily concerned with transfer payments as interventions (these
might include income maintenance payments, subsidized health insurance,
housing allowances, unemployment benefits for released prisoners, or
similar aid). The delivery of transfer payments is fairly readily
evaluated because it can be conducted within the framework of a
randomized experiment that simulates or imitates closely the ways in
which such payments would be delivered when embodied in a statutory
program. Human services delivery, however, is not so easily measured;
unfortunately, many human services interventions that work quite well
in randomized controlled experiments administered by researchers
often fail miserably in other contexts. This is primarily because it is
difficult to standardize the delivery of human services, especially
when the deliverers are professionals who have considerable autonomy in
the exercise of their professional functions. Interventions that work
well under highly controlled situations may, as a consequence, fail to
work at all in the field or institutional context.

The expectable difference between pilot runs and production
runs often means that an intervention must be tested twice. It is tested
first within the context of a carefully controlled experiment; results
from such an experiment provide estimates of an intervention's
effectiveness under the most favorable circumstances as administered by
the dedicated designer of the intervention. It is then tested within the
context of the institution that will be given the delivery mandate if the
intervention is incorporated into statutory policy.

Perhaps most representative of the sort of double testing
suggested here are the experiments sponsored by the Department of Labor
(Rossi, Berk, and Lenihan, forthcoming) concerning the efficacy of
extending unemployment insurance benefits as postrelease financial
aid to ex-felons. Such aid, of course, is intended to reduce recidivism
by easing the transition to civilian employment. This intervention was
tried initially on a small scale in Baltimore as a randomized
experiment; it was run by a devoted social researcher who administered
payments and provided job placement services with the aid of a small
but conscientious staff. The Baltimore experiment produced very
encouraging results, reducing arrests on property-related charges (that
is, burglary, robbery, and larceny) by about 8 percent, a 25 percent
reduction in recidivism for such charges during the postrelease year as
compared with the control group. The Department of Labor then tested
the same program in more policy-relevant settings by having the
departments of corrections and unemployment security in Georgia and
Texas administer it on a trial basis as a randomized controlled
experiment. As administered by those agencies, the intervention had no
significant impact on arrests on property-related charges during the
postrelease year in either state. Under some conditions, financial aid
was apparently effective, but not when it was delivered by the kinds of
agencies that would be responsible for it if it were enacted as a
national policy.

Program implementation involves other measurement issues
besides evaluating the effectiveness of interventions. The success of a
program depends on how effectively it is implemented. Hence, measuring
service delivery is a problem in the administration of all programs,
especially those, such as human services, that must rely heavily on
personnel for delivery. From the perspective of decision making, it
is necessary to know not only whether or not an intervention will work
under certain circumstances but also whether or not it will work within
the context of the institution that will have to administer it. This
suggests that policy making should be more tentative in establishing a
program, making provisions for both close tracking of how well the
program is being implemented and periodic checks on its effectiveness
as it is delivered.

Do the Intervention's Benefits Justify Its Costs? The general idea
behind benefit-to-cost analyses is quite simple: policies that create
benefits greater than their costs are the only ones worth enacting, and
policies with high benefit-to-cost ratios make better use of resources
than do those with lower ratios. Going beyond this general idea to the
calculation of benefit-to-cost ratios, however, one leaves a simple
world and enters a maze of intricate computations. To begin with,
benefits and costs may be regarded from many viewpoints, from those of
individual recipients of an intervention to those of individual
taxpayers, of the institution involved, and finally, of the government
administration or the society as a whole. Very costly educational
interventions may offer very high benefit-to-cost ratios to recipients
but fractional ones to every other party.

Second, calculating a benefit-to-cost ratio requires reducing all
benefits and costs to some common metric, usually monetary units.
This may make sense in the calculation of benefit-to-cost ratios for
dams and irrigation systems, whose main effects may be calculated in
monetary terms, but how can we measure how much a person benefits
from learning more math? What is the benefit to society of raising the
national average of math scores on the Scholastic Aptitude Test (SAT)
by two or three points? It is clear that there are some societal
benefits, but it is difficult, if not impossible, to measure such
benefits in terms that would make sense to everyone concerned.

Finally, benefit-to-cost ratios are generally very sensitive to the
discount rates applied to expenditures. Since investing monies at a
given time on an intervention means that alternative investments cannot
then be made that might accrue interest over the future, it is
necessary to compare the worth of the present expenditures discounted
for the future worth of investment alternatives. Discount rates are
largely conjectural and, for interventions that call for a fairly large
amount of present-day expenditures, benefit-to-cost ratios can vary
widely.
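
The sensitivity to discount rates can be illustrated with a minimal
sketch in Python. Every figure here is hypothetical and chosen only for
illustration (an intervention costing 1,000 units up front and returning
150 units a year for ten years); the function names are likewise
assumptions of this sketch and do not come from any study cited in this
chapter.

```python
def present_value(cash_flows, rate):
    """Discount yearly amounts (year 0 first) back to the present."""
    return sum(amount / (1 + rate) ** year
               for year, amount in enumerate(cash_flows))


def benefit_cost_ratio(benefits, costs, rate):
    """Ratio of discounted benefits to discounted costs."""
    return present_value(benefits, rate) / present_value(costs, rate)


# Hypothetical intervention: all costs paid in year 0,
# benefits of 150 per year arriving in years 1 through 10.
costs = [1000.0] + [0.0] * 10
benefits = [0.0] + [150.0] * 10

for rate in (0.02, 0.05, 0.10):
    ratio = benefit_cost_ratio(benefits, costs, rate)
    print(f"discount rate {rate:.0%}: benefit-to-cost ratio {ratio:.2f}")
```

With these assumed figures, the same program shows a ratio above 1 at
a 2 percent discount rate and below 1 at 10 percent, so the verdict
depends on a rate that is itself largely conjectural.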

For these reasons, benefit-to-cost calculations, at least as
applied to social programs, tend to approximate the truth value of
science fiction: they are interesting, perhaps even insightful, but they
are mainly the product of some fertile imagination. This is particularly
true of benefit-to-cost calculations applied to programs whose
effectiveness has not yet been tested but is simply taken for granted. Costs
and benefits, of course, should not be ignored. Indeed, I hold the
contrary view. Calculations of cost effectiveness (that is, the cost of a
delivered unit of effectiveness) are especially useful. For example,
given a program that is effective in raising the average scores on some
standardized test of reading ability, it is possible to compute how much
each unit gain in reading scores costs. As a further illustration, Cook
(1975) reports that, by his calculations, each additional letter of the
alphabet learned by a preschool child through exposure to Sesame
Street costs approximately $.25. And in the Baltimore experiment
conducted by the Department of Labor, it costs about $1,600 to avert
each incident of recidivism, an amount that may seem excessive until
one compares it with the costs of processing an arrested person through
the criminal justice system and maintaining that person in jail for a
typical two-year sentence.

Calculating cost effectiveness requires close monitoring of costs
and units of services delivered, as well as measures of effectiveness. The
same research operations and measures, with the addition of cost
accounting, that can be used to monitor the delivery of services can
provide the basic information used to calculate cost effectiveness.
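
As a rough sketch of the arithmetic involved, the following Python
fragment computes the cost of one delivered unit of effectiveness. The
program cost, enrollment, and score gains are hypothetical, and the use
of a comparison group's gain as the baseline is an assumption of this
sketch rather than a procedure described above.

```python
def total_effect(participants, mean_gain_program, mean_gain_comparison):
    """Units of effect delivered: the gain attributable to the program
    (program gain minus comparison-group gain), summed over participants."""
    return participants * (mean_gain_program - mean_gain_comparison)


def cost_per_unit_effect(total_cost, units_of_effect):
    """Cost of one delivered unit of effectiveness."""
    if units_of_effect <= 0:
        raise ValueError("no positive effect to divide the cost over")
    return total_cost / units_of_effect


# Hypothetical reading program: 200 children, an average gain of 6.0
# points against 3.5 points in a comparison group, at a cost of $50,000.
effect = total_effect(200, 6.0, 3.5)
print(cost_per_unit_effect(50_000.0, effect))
```

The function refuses to report a figure when no positive effect has
been measured, which mirrors the point that cost effectiveness is
meaningful only for programs whose effectiveness has been demonstrated.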

conclusion 

This chapter has examined some of the major issues that arise
in measuring the effectiveness of and making decisions about
interventions. Assuming that precise and accurate measurements of the
effectiveness of interventions are expensive, I have stressed that there are
circumstances under which one should not undertake measurement:
some interventions are simply too trivial to waste resources on, and
others are so poorly defined that any measurement is bound to be
baffling and equivocal. For those interventions that are evaluable, I have
illustrated the considerable difference between the effectiveness of an
intervention conducted under pilot-run conditions and the
effectiveness of one administered by the institution that will have the ultimate
responsibility for it if the program is enacted. I have also expressed a
pessimistic view of benefit-to-cost calculations as being largely
conjectural and ordinarily highly dependent on shaky assumptions. In
their place, I have stressed the usefulness of measuring cost
effectiveness.

Measuring the effectiveness of interventions and the costs
associated with measured effects provides only one part of the information
that goes into the decision-making process. No matter how well an
evaluation is conducted, it would be naive to expect the resulting
measures of effectiveness to have an all-determining impact on the
decisions of policy makers. There are many reasons to enact
interventions into policy without considering their effectiveness. For instance,
equity considerations may completely outweigh considerations of
effectiveness. In addition, constituency demands arising from clients,
organizations, and perhaps even suppliers may appear more cogent to
decision makers than the representations of evaluation researchers.
Indeed, would one have it otherwise? In a democratic society, is it not
better to have policy that is responsive to the push and pull of politics
than to the outcomes of social research?

Throughout the last decade, and most certainly during the past few
years, lawmakers and educational administrators in the federal
government have relied increasingly on measurement and evaluation in
making their decisions. The reasons are clear enough: a growing
demand for accountability in education and a clear need for
accurate yardsticks to measure the efficacy of educational systems,
programs, curriculums, and student learning. Congress wants more
objective, pragmatic evidence on which to base its decisions about whether
support for programs should be increased, decreased, or abandoned.
Educational administrators, teachers, parents, Office of Management
and Budget staff, and others outside the legislative halls also want
objective evidence to support their proposals and programs.

My current role involves working with all 120 programs
administered by the U.S. Office of Education (OE). In this capacity I can see,
though somewhat dimly, the constellation of forces that focus on the
Congress and the administration and attempt to persuade, cajole, lure,
or threaten them into taking appropriate action. In this chapter, I will
examine the influence of testing and research on two major
concerns: Title I of the Elementary and Secondary Education Act
(ESEA) and bilingual education. These two areas exemplify the
increasing impact testing is having on decision making at the federal
level; in addition, they show how testing may be applied in other
educational program areas.

testing and Title I

Title I of ESEA is the flagship of federal elementary and
secondary education programs. It was authorized in 1965 to provide special
educational services to educationally deprived children in low-income
areas, and its budget has grown from $959 million in fiscal year 1966 to
over $3 billion today. Title I funds now go to 14,000 of the 16,000
school districts in the country. Measurement, in the form of
evaluation, has been an integral part of Title I since its enactment. However,
first-generation evaluations were basically efforts to find "successful"
Title I programs. By today's standards, they were relatively primitive
and imprecise, and they were inadequate and unsatisfactory for the
purposes of the OE and Congress. In fact, there were serious
discussions about whether to perform radical surgery on Title I or to
abandon it altogether.

In 1974, Congress amended Title I by adding several new duties
for the U.S. Commissioner of Education, who was to:

strengthen the requirement for independent evaluations of Title I
programs and [...] standards for evaluating the effectiveness of those
programs; [...] states to provide jointly sponsored [...] with
evaluation models utilizing objective criteria and to produce data
that are comparable on a statewide and national basis; and provide
states with technical assistance for developing and applying their
evaluation programs.

As a result of the 1974 education amendments, the National
Institute of Education (NIE) and OE conducted a number of studies
and surveys. Contrary to earlier findings, these studies showed an
increase in achievement for Title I children. Moreover, the NIE study
indicated that the effectiveness of Title I programs directly correlated
with the quality of administration: programs that were administered
well tended to be better than poorly administered programs. These
studies led to the general conclusion that Title I was indeed working
and that it could be made even better.

Congressional action in 1978 reflects, in part, the results of the
[...] was made. Including the new concentration proposal at $400 million,
the total Title I appropriation will be over $3.4 billion; this new total
represents an increase of 27 percent over last year, the largest increase
in the history of the program. Furthermore, Congress placed more
responsibility on the states for monitoring Title I programs and
incorporated into law some of the provisions in current regulations dealing
with program administration.

The research conducted by NIE was also good for the agency.
The report of the Senate/House Committee on the reauthorization of
ESEA stated, "The high quality and extremely useful work
accomplished by NIE in the ESEA Title I study was particularly influential in
impressing the Congress that the institute has grown and matured. It
now represents a unique and solid resource [that] administrators and
educational policy makers can depend on for the study of difficult and
previously unknown areas [that] affect learning and the education
process, as well as national education policy issues" (Conference Report on
H. R. 15, 1978, p. H12224).

The education amendments of 1978 reflect Congress' desire for
still more and better measurement. A lengthy section on program
evaluation specifies that the commissioner of education shall continue to
provide for independent evaluations of Title I programs and projects
as well as technical assistance. In addition, the commissioner must
report the results of evaluations to Congress no later than February in
1980, 1982, and 1984.

A number of Title I evaluation studies are currently under way;
three that should be completed by next spring undoubtedly will have
substantial impact on Title I legislation next year. Those conducting
these studies are seeking to determine: (1) what percentage of students
retain fall-to-spring achievement gains during the summer; (2) the cost
effectiveness of the various types of Title I services; and (3) the nature
and extent of parental involvement in the education of children.

Talk about the failures of Title I has virtually disappeared.
There is no question that constituency pressures and social needs
overshadowed any test results in determining funding levels for Title I. Yet
Congress clearly wanted to ensure that the dollars appropriated were
being used wisely: congressional committees continued to press OE for
evidence that the programs were working. But in this session, Congress
turned away from questioning the desirability of having such a
program, instead focusing its attention on making Title I more flexible
and more effective. Without data documenting student success and
pointing the way toward program refinements, I seriously doubt that
such positive congressional action would have been taken.

evaluation and bilingual education

An estimated 15 million persons of limited English-speaking
ability live in this country. About 24 percent of them, or 3.6 million,
are four to eighteen years of age and therefore of particular concern to
our public and private schools. An overwhelming number (69 percent,
or 2.5 million) of these young people speak Spanish. Only five
other languages account for more than 50,000 persons each: Italian,
French, Filipino, German, and Chinese.

In 1968, Congress enacted the Bilingual Education Act as Title
VII of ESEA and appropriated $7.5 million for bilingual education. In
1974, the U.S. Supreme Court, in Lau v. Nichols, ruled that the San
Francisco school district must provide special programs for children of
limited English-speaking ability. Although the court did not
specifically require bilingual education, that approach was one option for
meeting the new requirements and assuring equal access to education.
Following that decision, Congress substantially broadened Title VII to
help states and school systems better serve non-English-speaking
students. New amendments called for more deliberate and systematic
teacher training and curriculum development. They also authorized
funds for creating resource centers to help teachers, as well as materials
development centers and assessment and dissemination centers.

Unfortunately, research in bilingual education to date is
fragmentary and inconclusive. A major study of the subject was conducted
by the American Institutes for Research (AIR) under a $1.5 million
contract with OE. Results of this study, released in the spring of 1977,
caused reverberations in the educational community that are still
being felt today. The researchers found that less than one third of the
students participating in bilingual classes were of limited English-speaking
ability and that, in the judgment of teachers, approximately three
fourths of the fourth, fifth, and sixth graders in Title VII classrooms
were either English monolingual or English-dominant bilingual
students (Danoff, 1978). The researchers also noted that, in their study
sample, Title VII students had slightly lower grades in English than
did students who were not in Title VII programs; in mathematics,
across grades, they were performing at about the same level as students
not in Title VII.

In August 1978, the National Conference on the Education of
Hispanics issued a statement saying that the AIR report had been
"seriously questioned by several independent researchers of renowned
competence." The conference went on record as repudiating the report
and passed a resolution asking that OE also repudiate the report and
take steps to replicate the study. Others have made similar requests
(U.S. Office of Education, 1978).

It is understandable that the Hispanic community would be
upset. When a research study seriously questions the results of a
program designed to address some of the long-neglected cultural,
linguistic, and academic concerns of Hispanics, it is to be expected that
such a study would itself be subject to intense scrutiny. It was not
unlike the early days of Title I, when negative results appeared so
frequently.

Congress, of course, was concerned about claims that the
majority of pupils in the program were competent in English. Thus,
when it acted on the bilingual program this year, it mandated that no
more than 40 percent of the pupils in the program should be children
whose native language is English. (To avoid problems of segregation,
some English-speaking pupils had to be eligible to participate.)
Congress also changed the description of these children from "limited
English-speaking" to "limited English proficiency," since speaking is
only one factor that should be considered. In addition, Congress
increased its appropriation for bilingual programs to $150 million and
called for additional research. The research that will emerge,
including studies on entry criteria for bilingual education programs,
exit criteria, program effectiveness, and teacher training, will have an
impact on future appropriations. Clearly, the research available is
insufficient for making important decisions.

national testing 

A final concern I wish to address in this chapter is the alleged 
specter of a federally sponsored national competency test. The Carter 
Administration has expressed its opposition to such a federal role. 
Joseph A. Califano, Jr., Secretary of Health, Education, and Welfare,
agreeing with a report from the National Academy of Education, 
stated: "A national test would improperly centralize a matter of state 
and local control" (Califano, 1978, p. 4).

What is the federal role? I believe it should be based on how we
can best help state and local school districts. In the session just
adjourned, Congress authorized the U.S. Commissioner of Education
to make grants to states and to individual school districts for
implementing educational proficiency standards and providing assistance
with achievement testing. While no money for this program has yet
been appropriated, the debate about this legislation is instructive. I
recall hearing late at night the conference-committee dialogue
concerning the issue of federal control. To guard against federal
influence, a protective statement was adopted concerning proficiency
standard assistance that says: "Nothing in this section shall authorize the
commissioner to impose tests on state educational agencies or local
educational agencies, and no such agency shall be compelled in any
way to apply for funds under this section" (Conference Report on H.
R. 15, 1978, p. H12131). Similarly, regarding assistance with
achievement testing, the safeguard provision stated: "Nothing in this section
shall authorize the commissioner to require specific tests or test
questions. Any state or local educational agency may refuse to use any test
or test question developed under this section" (Conference Report on
H. R. 15, 1978, p. H12181).

Congress is responding to public pressure and test results
concerning achievement levels in American schools. However, sensitive to
the dangers of federal control, it has placed clear limitations on the
Office of Education in administering the laws.

In summarizing this brief trip through some recent decisions, I
would draw the following general conclusions:

1. In an increasingly complex society, tests will continue to
have an important impact on individuals, institutions, and
the decisions that are made about education.

2. Constant vigilance must be exercised to ensure that the
federal role continues to be one concerned with research,
technical assistance, and funding rather than one of domination
or control.

3. The educational research establishment must expand its
methodological approaches from traditional reliance on
psychology and statistical analysis to include the use of the
wider range of methodologies now common in other sciences.

4. Policy makers at all levels must be willing to make intelligent
adjustments to programs based on results.

5. While increased dollars and continuing authorizations are
welcome signs, we must remind ourselves that the real
measures of success are how well students learn and how
significantly their life chances are improved.

conclusion



The short political life cycle of people and events in
Washington often stresses instant success, but it should become increasingly
apparent that, in the long run, the best policy will be to support
programs that demonstrate positive and tangible long-term results. It is
naive to believe that research, however sophisticated, will resolve
intensely political questions. Conversely, it is unnecessarily cynical to
believe that research results have little or no effect on legislative action.
Congress and the administration do take seriously responsible evidence
of program effectiveness, and I sense a willingness to make effective use
of the results of major studies. Critical questions remain unanswered,
and numerous decisions must be made about the focus of programs
and the allocation of scarce resources. Solid research aided by refined
measuring instruments and new methodologies will be increasingly
helpful in making those decisions in the years ahead.

from the introduction



Demands by educational policy makers for applications of
measurement to significant new tasks are having far-reaching
effects on education and measurement. Measurement
professionals are being asked to help in designing educational
programs for, among others, children with learning
disabilities, gifted children, and bilingual children. These
professionals are also being asked how measurement can help
in allocating funds to schools, determining qualifications for
high school diplomas and college degrees, and evaluating the
worth of new educational programs. These new and complex
demands have given rise to congressional debate, federal and
state conferences, and extensive discussion and developmental
work by measurement specialists. Measurement and
educational policy is the theme of this inaugural volume of
New Directions for Testing and Measurement, which includes
the ten papers presented at the 1978 Educational Testing
Service Invitational Conference.